How to play and play with thousands of digitized 78rpm records


There are over 50,000 user-uploaded recordings of 78s, and there are now tens of thousands of high-bitrate, unrestored transfers of 78s that are part of the Great 78 Project.

With this many, it gets hard to find things you want to explore. Here are some techniques I use:

Again, I recommend the “play items” link, as it plays along the way YouTube does.

To download for research and preservation purposes:

  • Download from the right-hand side of a “details” page, where you can also click to see the whole list of files.
  • There is the “best” stylus version (according to an audio engineer at George Blood Co), which is renamed to be more or less iTunes-compatible, but the recordings from all the styli are there in flat and equalized versions, and each of those in FLAC and MP3 formats.

To download many records for research and preservation purposes (requires Linux or Mac and command-line skills):

  • Install the Internet Archive command line interface
  • To download metadata in JSON for our 78 transfers, in bash:
    • for item in `./ia search "collection:georgeblood" --itemlist`; do curl -Ls "https://archive.org/metadata/$item/metadata"; echo ""; done
    • I installed GNU parallel to speed things up (I use "brew install parallel" on a Mac):
    • ./ia search "collection:georgeblood" --itemlist | parallel -j10 'curl -Ls https://archive.org/metadata/{}/metadata' > 78s.json
  • To download all files of the high-bitrate transfers (the command can be rerun to pick up failures or new additions):
    • ./ia download --search="collection:georgeblood"      (14 TB at this point)
  • To download only the metadata and FLACs of a single item:
    • ./ia download 78_--and-mimi_frankie-carle-and-his-orchestra-gregg-lawrence-kennedy-simon_gbia0006176a --format="24bit Flac" --format="Metadata"
  • To download only the metadata and MP3s of all ragtime recordings:
    • ./ia download --search="collection:georgeblood AND ragtime" --format="VBR MP3" --format="Metadata"
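The bash loop for metadata earlier in this list relies on word splitting of the search output. As a rough sketch of the same idea, the helpers below read identifiers line by line and write one record per line. The function names metadata_url and fetch_metadata are made up for this illustration; the URL pattern is the one used in the commands above, and the sketch assumes each response comes back as compact, single-line JSON.

```shell
#!/usr/bin/env bash
# Sketch only: metadata_url and fetch_metadata are hypothetical helper names.

# Build the metadata URL for one item identifier.
metadata_url() {
  printf 'https://archive.org/metadata/%s/metadata\n' "$1"
}

# Read identifiers from stdin (one per line) and print one record per line.
fetch_metadata() {
  local item
  while IFS= read -r item; do
    [ -n "$item" ] || continue           # skip blank lines
    curl -Ls "$(metadata_url "$item")"   # same endpoint as the loop above
    echo                                 # newline so each record gets its own line
  done
}
```

It could be used as: ./ia search "collection:georgeblood" --itemlist | fetch_metadata > 78s.json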

If you want to do more with downloading specific sets, I suggest the documentation or joining the Slack channel.
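Since the bulk download is described above as repeatable after failures, one way to automate the reruns is a small retry wrapper. This is a generic sketch, not part of the ia tool: the function name retry and the RETRY_DELAY variable are invented here, and it assumes the wrapped command exits with a non-zero status when something failed.

```shell
#!/usr/bin/env bash
# Sketch only: retry and RETRY_DELAY are hypothetical names, not part of ia.

# Usage: retry MAX_ATTEMPTS command [args...]
# Reruns the command until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts=$1
  shift
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-60}"   # pause between attempts (seconds)
  done
}
```

For example: retry 5 ./ia download --search="collection:georgeblood AND ragtime" --format="VBR MP3" --format="Metadata"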

How You Can Help: Please help find dates for these 78s

We are dating some automatically by matching against 78discography.com and discogs.com, but many are done by hand, finding entries in Billboard magazine, DAHR, and the like. Many still need dates.

If you would like to help, then please do research and post your findings in the review of a 78rpm record, citing your sources.  Then someone with privileges will change the metadata in the item.

The complete collection has date facets on the left reflecting the dates we have found. But we only have dates for about half, and thousands of 78rpm sides are posted each month, so we need help!

To find what others have done, you can list them in the order of the most recent reviews.

The most recent ones that have neither a date nor a review are here. This is a good starting place.

But again, these need dates.  If you tried and could not find anything online, then please post a review to that effect so others do not spend time on the same one.   
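If you have already downloaded the metadata with the bash for loop from the command-line section (one record per line), a rough local list of undated items can be pulled from that file. The sketch below is only an approximation: the function name undated_items is invented here, and it uses plain text matching on the "date" and "identifier" fields rather than real JSON parsing, so a JSON-aware tool would be more robust.

```shell
#!/usr/bin/env bash
# Sketch only: undated_items is a hypothetical helper name.
# Assumes one JSON record per line, each containing an "identifier" field.

# Print the identifiers of records that have no "date" field.
undated_items() {
  grep -v '"date"[[:space:]]*:' "$1" |
    sed -n 's/.*"identifier"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example: undated_items 78s.json > needs-dates.txt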

If you find or know other information about the performer, performance, or piece, please add it. Links to YouTube, Wikipedia, and old magazines like Cashbox and Billboard are also welcome.

For those who get into it, we invite you to join the Slack channel (a great tool if you have not used it already), which is where some of the discussion happens. Caitlin@archive.org can set you up.

Oh, and I have gotten a bit obsessed: this is a Twitter feed that posts a digital transfer every 10 minutes, which I probably visit more often than I should.

Restoration techniques:

I have been using Dartpro MT. I like it because it has a “Filter builder”.
The program doesn’t like 24 bit, but I transfer at 24/96,000 and resample to
16/96,000, then decrackle starting with a setting of 50, repeating the process
with an increase of 10 each time, i.e. 50, 60, 70, 80 (maximum). If
more noise is still there, I run repeatedly at 80 until the reported
interventions get to a number of 4000 or so. I can then manually remove any
clicks that are left.

I don’t use the denoise or dehiss, preferring to use declick at very low
settings (78s don’t have hiss; what you hear as hiss is the combination of
many little clicks).

I start with a setting of 2, then 4, then 6. I leave the settings the same,
then just find whether 1 or 2 or 3 passes will polish the higher noise away.
Decrackle doesn’t affect the high frequencies but declick does. This process
takes a little time but I almost never find that any distortion is
introduced to the sound.  I hate getting to the end of the record where the
really growly trumpet sound is a distorted mess, and this workflow prevents
that. – Mickey Clark

My “go to” software for restoration work is iZotope. However, I also have Adobe Audition, Diamond Cut, Pro Tools, Samplitude, and Sound Forge available for specific situations. As Ted Kendall points out: Hearing and Judgment play the major roles in both transfer and restoration. To that I would add Experience. And, as always, start with the best possible source.

Get Involved

Please write to the Internet Archive’s music curator, bgeorge@archive.org, or more generally to info@archive.org.

Please join this project to:

  • Share knowledge. Help improve the metadata, curate the collection, contact collectors, do research on the corpus, etc.
  • Include your digitized collection. If you have already digitized 78s or related books or media, we’d like to include your work in the collection.
  • Digitize your collection.  We’ve worked hard to make digitization safe, fast and affordable, so if you’d like to digitize your collection we can help.
  • Donate 78s.  We have 200,000 78s, but we are always looking for more.  We will digitize your collection and preserve the physical discs for the long term.

If you are in the Bay Area of California, we can also use help packing 78s for digitization, and please come over for lunch on a Friday.