Category Archives: Audio Archive

National Library Week 2023: Liz, donations

To celebrate National Library Week 2023, we are introducing readers to four staff members who work behind the scenes at the Internet Archive, helping connect patrons with our collections, services and programs.

Liz Rosenberg first worked with the Internet Archive in the early days of the Great 78 Project. She helped design the digitization workflow for 78rpm records and estimates that she transferred 30,000 sides herself.

The self-described “record lady,” Rosenberg said the project was the perfect entrée to the organization. She graduated from Drexel University with a degree in music industry technology, with a specialty in audio recording and production.

In 2020, Rosenberg was officially hired by the Internet Archive in patron services and later asked to lead the organization’s physical donation program. She continues with the Great 78 Project, overseeing monthly uploads, resolving metadata issues and coordinating digitization of donated collections with partners at George Blood LP.

“The Internet Archive is a place that I had always dreamed of working,” Rosenberg said. “I really looked up to the mission of the Internet Archive, so when the opportunity came up to work for them directly, I couldn’t have said yes faster.”

As donations manager, Rosenberg receives inquiries from individuals and librarians about donating their physical media to the Internet Archive for preservation and digitization, from single items to collections of millions of objects. She has overseen the donations of small folk music collections, individual collectors’ passion projects, and college libraries including Bowling Green State University and the University of Hawaii. 

The individual collector contributions often are triggered by the death of a loved one. “Those tend to be sensitive situations for families,” she said. “But they are grateful to almost be able to spend time with them through the preservation of their collection and be able to go and visit whenever they want. That’s very special.”

Rosenberg keeps a “warm and fuzzy thank you file” on her computer from donors that she said keeps her motivated to encourage others to share their collections, like the message below:

Dear Liz,

You are amazing! Thank you for your kind guidance and generous ways. Seeing the dedication today has brought a difficult and costly task of storing these books over such a long period of time to this heartfelt moment and for such a worthy cause. I am in the middle of grading portfolios and preparing for a solo art exhibition so, as usual, I need to juggle the books in between. I will be in touch soon but, again, I just wanted to let you know how wonderful you and your organization are 🙂

in kindest regard, Karen

What is the most rewarding part of your job?
For me, it’s really about preserving stories. I feel such a connection to donors that I work with when I get to hear the story of how a collection was created. We want to preserve those stories alongside the media itself. And that’s really such a privilege.

What has been your greatest achievement (so far) at the Internet Archive?
Presenting on behalf of the Internet Archive at the 2022 Association for Recorded Sound Collections Conference. A recording of the presentation, as given to the Internet Archive staff shortly after the conference, can be found on the Internet Archive here.

What’s your favorite item at the Internet Archive?
This transcription recording of a child playing accordion: https://archive.org/details/78_four-leaf-clover_sonny-walikis-and-his-squeeze-box_gbia0001730a. We transferred this record without knowing who the performer was or anything about their history. The family of Sonny Walikis actually found the recording in our collection shortly after their family member had passed away and reached out to tell us the history of the recordings. I always think of this record as the best example of why we preserve media – to connect people to lost stories and help memories live on.

What’s your favorite collection at the Internet Archive?
The 78rpm record collection! archive.org/details/georgeblood

What are you reading?
The Tower of Swallows by Andrzej Sapkowski

What is your secret talent?
Morphing into a children’s choir! I was a recording studio intern and we had children booked to sing the part but they got too distracted in the booth. So I sang all of the parts slowed down 10% and we sped them up to make me sound “child-like”. The results are one of my only vocal credits: https://www.youtube.com/watch?v=WlKhVhuTiik.

New additions to the Internet Archive for July 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons: 

Books – 78,091 New Items in July

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Audio Archive – 91,636 New Items in July

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

LibriVox Audiobooks – 119 New Items in July

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

78 RPMs and Cylinder Recordings – 8,888 New Items in July

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Live Music Archive – 965 New Items in July

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Movies – 135 New Items in July

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

New additions to the Internet Archive for May 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons: 

Books – 52,300 New Items in May

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Audio Archive – 89,325 New Items in May

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

LibriVox Audiobooks – 92 New Items in May

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

78 RPMs and Cylinder Recordings – 112 New Items in May

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Live Music Archive – 807 New Items in May

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Netlabels – 223 New Items in May

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Explore.

Movies – 110 New Items in May

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

New additions to the Internet Archive for April 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons: 

  • Chris Cromwell Rare Reel to Reel Tapes – Rare and recovered reel-to-reel tapes from a variety of sources and preserved by Chris Cromwell. 
  • 1940s Classic TV – Television from the 1940s.
  • Game Shows Archive – A collection of game shows throughout television history, involving chance, skill and luck, usually presided over by a host and providing in-show commercials.
  • Dutch Television – Television programs and videos in the Dutch language, or from the Netherlands.

Books – 50,109 New Items in April

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Audio Archive – 150,224 New Items in April

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

LibriVox Audiobooks – 99 New Items in April

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

78 RPMs and Cylinder Recordings – 6,745 New Items in April

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Live Music Archive – 909 New Items in April

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Netlabels – 111 New Items in April

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Explore.

Movies – 55 New Items in April

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

New additions to the Internet Archive for March 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections from our patrons: 

Books – 60,379 New Items in March

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Audio Archive – 93,954 New Items in March

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users. Explore.

LibriVox Audiobooks – 122 New Items in March

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages. Explore.

78 RPMs and Cylinder Recordings – 7,423 New Items in March

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century. Explore.

Live Music Archive – 1,098 New Items in March

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission). Explore.

Netlabels – 186 New Items in March

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres. Explore.

Movies – 25 New Items in March

Watch feature films, classic shorts, documentaries, propaganda, movie trailers, and more! Explore.

What’s New in February 2022

Here are some of the notable new additions to the Internet Archive from February 2022. (Logging in might be required to borrow certain items.)

Notable new collections: 

We’ve been reorganizing some of the items uploaded by our users, and these collections of magazines struck us as particularly interesting:

Books 45,073

This month we’ve added books in more than 20 languages. Here are a few good ones to start with:

Audio Archive 73,305

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users.

The LibriVox Free Audiobook Collection 118

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audio versions of public domain texts: poetry, short stories, whole books, even dramatic works, in many different languages.

78 RPMs and Cylinder Recordings 8,840

Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century.

Live Music Archive 892

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming.

Netlabels 263

The Netlabels collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of virtual record labels.

Internet Arcade 5

The Internet Arcade is a web-based library of arcade (coin-operated) video games from the 1970s through to the 1990s, emulated in JSMAME, part of the JSMESS software package. Containing hundreds of games ranging through many different genres and styles, the Arcade provides research, comparison, and entertainment in the realm of the Video Game Arcade.

New additions to the Internet Archive for January 2022

Many items are added to the Internet Archive’s collections every month, by us and by our patrons. Here’s a roundup of some of the new media you might want to check out. Logging in might be required to borrow certain items.

Notable new collections: 

Books 40,695

This month we’ve added books on varied subjects in more than 20 languages. Click through to explore, but here are a few interesting items to start with:

Audio Archive 79,099

The audio archive contains recordings ranging from alternative news programming, to Grateful Dead concerts, to Old Time Radio shows, to book and poetry readings, to original music uploaded by our users.

The LibriVox Free Audiobook Collection 98

Founded in 2005, LibriVox is a community of volunteers from all over the world who record audiobooks of public domain texts in many different languages.

78 RPMs and Cylinder Recordings 6,849

The Great 78 Project! Listen to this collection of 78rpm records, cylinder recordings, and other recordings from the early 20th century.

Live Music Archive 799

The Live Music Archive is a community committed to providing the highest quality live concerts in a lossless, downloadable format, along with the convenience of on-demand streaming (all with artist permission).

Netlabels 486

This collection hosts complete, freely downloadable/streamable, often Creative Commons-licensed catalogs of ‘virtual record labels’. These ‘netlabels’ are non-profit, community-built entities dedicated to providing high quality, non-commercial, freely distributable MP3/OGG-format music for online download in a multitude of genres.

Welcoming Recorded Music to the Public Domain

Every January we feature works that are entering the public domain. And this year the big story is in recorded music.

Recorded Music from 1922 and earlier

Approximately 400,000 sound recordings made before 1923 will join the public domain in the U.S. for the first time due to the Music Modernization Act (read more at copyright.gov). You can peruse about 38,000 of them in our collection of digitized 78rpm records.

By 1922 we were solidly in the Jazz Age – F. Scott Fitzgerald’s Tales of the Jazz Age was published in 1922, and the term was already in popular usage. Jazz migrated from Black American communities in New Orleans into the rest of the United States, having evolved from its roots in ragtime, blues and Creole music. In fact, 1922 was the year Louis Armstrong left New Orleans to join King Oliver’s Creole Jazz Band in Chicago.

Alexander’s Ragtime Band (1911) written by Irving Berlin and performed by Collins and Harlan

Peruse the collection to hear early jazz classics like Don’t Care Blues by Mamie Smith and her Jazz Hounds, Ory’s Creole Trombone by Kid Ory’s Sunshine Orchestra, and Jazzin’ Babies Blues by Ethel Waters.

Early recordings by Bert Williams (the first Black American on Broadway and the first Black man to star in a film), Fanny Brice (the real-life ‘Funny Girl’), Enrico Caruso (the legendary Italian operatic tenor), and so many others give life and flavor to our imaginings of the early 20th century.

Here are some of the top songs from 1922, to give you a taste:

But personally, when I “flip through” these records I’m always drawn to the novelty songs.

There’s a whole genre of sound imitations, like Violin Mimicry where a violin is used to imitate people talking, Jingles from the Marsh Birds with a man imitating birds imitating popular songs (just as confusing as it sounds), and A Cat-astrophe with people imitating rather catastrophic cats to music.

You can also skip the jokes and go straight to laughing just for the sake of it with these gems: Laughs You Have Met, Gennett Laughing Record, and The Okeh Laughing Record, or choose to have a little music with laughing choruses like Ticklish Reuben, She Gives Them All the Ha-Ha-Ha, Stop Your Tickling, Jock! or And Then I Laughed.

And perhaps my favorite of the bunch is Fido is a Hot Dog Now, which seems to be about a dog who is definitely going to hell.

Fido is a Hot Dog Now (1914) by Billy Murray

Other Media from 1926

As usual, we are also welcoming some new books, movies, journals, and sheet music – this time from 1926! (Read about 1925, 1924, and 1923 in previous posts.)

Some popular first edition books from 1926:

The Clothes We Wear (1926) by Frank and Frances Carpenter

Other interesting books from 1926 that you might want to explore include Show Boat by Edna Ferber, which was made into the musical Show Boat in 1927 with music by Jerome Kern; The Clothes We Wear by Frank and Frances Carpenter, a child-friendly exploration of how clothes are made, all the way from the field through weaving and into sewing; and The Art of Kissing by Clement Wood, which is pretty self-explanatory.

We invite you to explore some of the other items dated 1926 in our collections to find your own fun items that may now be in the public domain.

Virtual Party for the Public Domain

Please join us for a virtual party on January 20, 2022 at 1pm Pacific/4pm Eastern time with a keynote from Senator Ron Wyden, champion of the Music Modernization Act, and a bunch of musical acts, dancers, historians, librarians, academics, activists and other leaders from the Open world! (And yes, we DO have a book from 1926 about how to throw the world’s best party.)

 Event on January 20th, 2022

REGISTER FOR THE VIRTUAL EVENT HERE!

University Professor Leverages 78rpm Record Collection From the Internet Archive for Student Podcasts

Examples of music & musicians covered by The Phono Project include, from left: John Lee Hooker, Sister Rosetta Tharpe, and Johnny Cash

When professor Jason Luther wants students in his Intro to Writing Arts class to learn about multimodal composition, he has them go to the Internet Archive for inspiration.

Students peruse 78rpm records going back to the early 20th century to find just the right one for their assignment. There is no lack of material, with more than 300,000 recordings from 1898 through the 1950s preserved. They are available to the public because of the collaborative Great 78 Project.

Although the students are enrolled at Rowan University in New Jersey, many are participating remotely from their homes this year because of the pandemic, and the materials are conveniently available digitally to them from anywhere.

Professor Jason Luther of The Phono Project.

“If the [Great 78 Project] didn’t exist, I don’t think I would have this curriculum at all,” said Luther, assistant professor for Writing Arts in the Ric Edelman College of Communication & Creative Arts at Rowan. “What I really like is the research challenge. It’s really powerful. So many times students have recovered the lost histories of these songs.”

For The Phono Project, Rowan students create podcasts and social media posts about recordings in the Archive’s 78s collection. They also tap into primary sources on the Archive to write the history of the songs. They can write about the stories behind songs like the Billie Holiday classic “God Bless the Child,” or John Lee Hooker’s “Boogie Chillen” from 1948. Many gravitate to artists like Elvis Presley or Frank Sinatra, but Luther tries to get them to branch out—especially now that there are more than 200 stories in the project’s collection.

Luther developed the project in 2018 as part of the “Technologies and Future of Writing” module in the writing course. Students have just eight classes to complete the 1-3 minute podcasts, in which they learn to master a mix of audio tools and editing skills using Audacity and WordPress. The course covers issues of compatibility and ownership, along with instruction on the economy of writing like a critic about lyrics and culture. For one recent class session, he invited Liz Rosenberg of the Archive to be a guest speaker and talk about the organization’s work and the Great 78 Project.

In the future, Luther said he would like to find more ways to incorporate some of the Archive’s collection into his curriculum. For instance, he may have students use primary source documents from independent publishers over time to craft something tangible, such as an actual history from those materials that could be passed along. “That’s one of the neat things about accretion,” he said. “We have the creativity, but then there’s also documents on the Archive that are helping us understand the 78s themselves. It’s such a vast resource.”

Visit The Phono Project.

***

Incorporating materials from the Internet Archive into your course curriculum is easy. Each semester we hear from instructors doing so worldwide. Let us know how you are weaving Internet Archive media into your classes by writing to us at info@archive.org.

Radio Ngrams Dataset Allows New Research into Public Health Messaging

Guest post by Dr. Kalev Leetaru

Radio remains one of the most-consumed forms of traditional media today, with 89% of Americans listening to radio at least once a week as of 2018, a number that is actually increasing during the pandemic. News is the most popular radio format and 60% of Americans trust radio news to “deliver timely information about the current COVID-19 outbreak.”

Local talk radio is home to a diverse assortment of personality-driven programming that offers unique insights into the concerns and interests of citizens across the nation. Yet radio has remained stubbornly inaccessible to scholars due to the technical challenges of monitoring and transcribing broadcast speech at scale.

Debuting this past July, the Internet Archive’s Radio Archive uses automatic speech recognition technology to transcribe this vast collection of daily news and talk radio programming into searchable text dating back to 2016. It continues to archive and transcribe a selection of stations through the present, making them browsable and keyword searchable.

Ngrams data set

Building on this incredible archive, the GDELT Project and I have transformed it into a research dataset of radio news ngrams spanning 26 billion English-language words across portions of 550 stations, from 2016 to the present.

You can keyword search all 3 million shows, but for researchers interested in diving into the deeper linguistic patterns of radio news, the new ngrams dataset includes 1-grams through 5-grams at 10-minute resolution, covering all four years and updated every 30 minutes. For those less familiar with the concept of “ngrams,” they are word frequency tables: the transcript of each broadcast is broken into words, and for each 10-minute block of airtime on each station a list is compiled of all the words spoken in those 10 minutes and how many times each was mentioned.
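To make the idea of a word frequency table concrete, here is a minimal Python sketch of how ngram counts per 10-minute block could be computed. It is an illustration only, not the pipeline used to build the dataset: the `(seconds, word)` input shape and the `ngram_counts` function are assumptions made for this example, and in this sketch ngrams never span two blocks.

```python
from collections import Counter

BLOCK_SECONDS = 600  # 10-minute blocks, matching the resolution described above


def ngram_counts(timed_words, max_n=5, block_seconds=BLOCK_SECONDS):
    """Return {block_start_seconds: {n: Counter of n-grams}}.

    timed_words is a list of (seconds_from_start, word) pairs, a hypothetical
    stand-in for one station's speech-recognition transcript.
    """
    # Group the lowercased words by the 10-minute block they were spoken in.
    blocks = {}
    for offset, word in timed_words:
        start = int(offset // block_seconds) * block_seconds
        blocks.setdefault(start, []).append(word.lower())

    # Count every run of n consecutive words inside each block.
    tables = {}
    for start, words in blocks.items():
        tables[start] = {
            n: Counter(
                " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
            )
            for n in range(1, max_n + 1)
        }
    return tables


# Made-up transcript fragment: six words in the first block, two in the second.
transcript = [
    (0, "Wash"), (2, "your"), (3, "hands"),
    (5, "wash"), (6, "your"), (7, "face"),
    (610, "stay"), (611, "home"),
]
tables = ngram_counts(transcript, max_n=2)
print(tables[0][1].most_common(2))
print(tables[0][2]["wash your"])
```

Each block ends up with one table per ngram length, which mirrors the 1-gram through 5-gram tables described above.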

Some initial research using these ngrams

How can researchers use this kind of data to gain new insights into radio news?

The graph below looks at pronoun usage on BBC Radio 4 FM, comparing the percentage of words spoken each day that were either (“we”, “us”, “our”, “ours”, “ourselves”) or (“i”, “me”, “i’m”). “Me” words are used more than twice as often as “we” words, but look closely at February 2020: as the pandemic began sweeping the world, “we” words start increasing as governments adopted language to emphasize togetherness.

“We” (orange) vs. “Me” (blue) words on BBC Radio 4 FM, showing increase of “we” words beginning in February 2020 as Covid-19 progresses
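A comparison like this can be sketched in a few lines of Python. The sketch assumes the 1-gram counts have already been summed per day into a plain dictionary. That input shape, the `pronoun_shares` name, and the use of lowercase words with straight apostrophes are assumptions for illustration, not the dataset's actual file format.

```python
WE_WORDS = {"we", "us", "our", "ours", "ourselves"}
ME_WORDS = {"i", "me", "i'm"}


def pronoun_shares(daily_counts):
    """Map each day to (we_percent, me_percent) of all words spoken that day.

    daily_counts: {day: {word: count}}, a hypothetical shape for 1-gram
    tables that have already been summed per day for one station.
    """
    shares = {}
    for day, counts in daily_counts.items():
        total = sum(counts.values())
        if total == 0:
            continue  # no transcribed words that day, so no percentage
        we = sum(counts.get(word, 0) for word in WE_WORDS)
        me = sum(counts.get(word, 0) for word in ME_WORDS)
        shares[day] = (100.0 * we / total, 100.0 * me / total)
    return shares


# Made-up counts for a single day, only to show the call.
example = {"2020-02-01": {"we": 3, "i": 6, "me": 1, "the": 90}}
print(pronoun_shares(example))
```

Dividing by the total number of words spoken each day, rather than comparing raw counts, keeps days with more or less transcribed airtime comparable.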

TV vs. Radio

Combined with the television news ngrams that I previously created, it is possible to compare how topics are being covered across television and radio.

The graph below compares the percentage of spoken words that mentioned Covid-19 since the start of this year across BBC News London (television) versus radio programming on BBC World Service (international focus) and BBC Radio 4 FM (domestic focus).

All three show double surges at the start of the year as the pandemic swept across the world, a peak in early April and then a decrease since. Yet BBC Radio 4 appears to have mentioned the pandemic far less than the internationally-focused BBC World Service, though the two are now roughly equal even as the pandemic has continued to spread. Overall, television news has emphasized Covid-19 more than radio.

Covid-19 mentions on Television vs. Radio. The chart compares BBC News London (TV) in blue, versus BBC World Service (Radio) in orange and BBC Radio 4 FM (Radio) in grey.

For now, you can download the entire dataset to explore on your own computer, but there will also be an interactive visualization and analysis interface available sometime in mid-spring.

It is important to remember that these transcripts are generated through computer speech recognition, so they are imperfect and do not properly recognize all words or names, especially rare or novel terms like “Covid-19.” Experimentation may be required to yield the best results.

The graphs above just barely scratch the surface of the kinds of questions that can now be explored through the new radio news ngrams, especially when coupled with television news and 152-language online news ngrams.

From transcribing 3 million radio broadcasts into ngrams to describing a decade of television news frame by frame, cataloging the objects and activities of half a billion online news images, to inventorying the tens of billions of entities and relationships in half a decade of online journalism, it is becoming increasingly possible to perform multimodal analysis at the scale of entire archives.

Researchers can ask questions that for the first time simultaneously look across audio, video, imagery and text to understand how ideas, narratives, beliefs and emotions diffuse across mediums and through the global news ecosystem. Helping to seed the future of such at-scale research, the Internet Archive and GDELT are collaborating with a growing number of media archives and researchers through the newly formed Media Data Research Consortium to better understand how critical public health messaging is meeting the challenges of our current global pandemic.

About Kalev Leetaru

For more than 25 years, GDELT’s creator, Dr. Kalev H. Leetaru, has been studying the web and building systems to interact with and understand the way it is reshaping our global society. One of Foreign Policy Magazine’s Top 100 Global Thinkers of 2013, his work has been featured in the presses of over 100 nations and fundamentally changed how we think about information at scale and how the “big data” revolution is changing our ability to understand our global collective consciousness.