AI Audio Challenge: Audio Restoration of 78rpm Records based on Expert Examples

We hope we have a dataset primed for AI researchers to do something really useful and fun: taking noise out of digitized 78rpm records.

The Internet Archive has 1,600 examples of quality human restorations of 78rpm records where the best tools were used to ‘lightly restore’ the audio files. This takes away scratchy surface noise while trying not to impair the music or speech. Also included in those items are the unrestored original files that were used.
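
One way researchers might use these pairs: because each restored file comes with its unrestored original, subtracting one from the other gives a rough estimate of what the restorer removed. The following is only a minimal sketch, assuming both files have already been decoded into equal-length, time-aligned lists of samples (real files would need decoding and alignment first). The function names and the toy signal are hypothetical and are not part of any Internet Archive tooling.

```python
import math

def removed_noise(original, restored):
    """Estimate what the restorer took out: original minus restored, sample by sample."""
    if len(original) != len(restored):
        raise ValueError("signals must be the same length and time-aligned")
    return [o - r for o, r in zip(original, restored)]

def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def noise_reduction_db(original, restored):
    """How much quieter the restored file is than the original, in dB (positive = quieter)."""
    return 20 * math.log10(rms(original) / rms(restored))

# Toy stand-in for a real pair (ASSUMPTION: synthetic data, not an actual record):
# a 440 Hz tone, plus a loud "click" every 500 samples in the unrestored version.
rate = 8000
clean = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
noisy = list(clean)
for n in range(0, rate, 500):
    noisy[n] += 0.9  # simulated surface click

residual = removed_noise(noisy, clean)
clicks = sum(1 for r in residual if abs(r) > 0.1)
print(f"clicks removed: {clicks}")
print(f"level change: {noise_reduction_db(noisy, clean):.2f} dB")
```

A model trained on such pairs could, in principle, learn to predict either the restored signal or the residual; which target works better is an open question.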

By contrast, the Internet Archive has over 400,000 unrestored files that are quite scratchy and difficult to listen to.

The goal, or rather the hope, is a program that can take all or many of the 400,000 unrestored records and make them much better. How hard this is remains unknown, but hopefully it is a fun project to work on.

Many of the recordings are great and worth the effort. Please comment on this post if you are interested in diving in.

12 thoughts on “AI Audio Challenge: Audio Restoration of 78rpm Records based on Expert Examples”

  1. Dustin Wittmann

    This project sounds somewhat ill-suited to AI in its current form for numerous reasons: click and noise removal are not a one-size-fits-all problem, and each recording needs to be addressed on its own merits. Even the best click removal algorithms will still mistake loud brass notes for clicks because they strongly resemble clicks on a spectrogram. Anything more than moderate click cleanup inevitably screws up the music unless it’s closely monitored and done in manageable chunks. It’s far better to use current algorithms and some form of batch processing. You might be better served by acquiring one or more CEDAR Cambridge computers with the server pack and doing batch processing on the archive. Using somewhat conservative settings would do much better than most of the “cleaned up” examples. Frankly, most of those are not very good exemplars.

    While batch processing could be very good, having a well-informed engineer work on them all individually will always be * much * better. There are no shortcuts.

    Surface noise manifests in a different pattern on every single record. AI therefore would have a very hard time drawing noise removal conclusions from other recordings. The best solution we currently have is CEDAR’s Auto-Dehiss and NR5 modules. I’m sure those could be improved; it’s probably better to collaborate with them than to reinvent the wheel.

  2. Sulio Pulev

    It’s already done. There are many AI approaches to this. Some specialist just needs to research them on GitHub.

  3. Mesut Sarpkaya

    I’m not a tech expert, but I love music.

    I can help if there is a need for a translator for Chinese and Japanese.

  4. Ken Arthur

    It is wildly unrealistic to suggest that an archive which contains 400,000 recordings can be adequately restored by assigning one specialist engineer to the task, as one reviewer proposes. If he took only half an hour to adequately process each recording (but probably it couldn’t be done properly in so short a time), he would deal with fewer than a hundred recordings each week. On that assumption, the task would take him at least 4,000 weeks.

    If you assign 400 volunteers, so that they would process only 1,000 recordings each, the task might be completed in ten weeks, if (probably an unrealistic assumption) they each process 100 recordings a week. But even with suitable software, an enthusiastic volunteer would not adequately replace a trained engineer.

    It is a misunderstanding to suppose that this is a task which can be done by software. CEDAR is not a software solution: it is a hardware solution. To make use of CEDAR would require re-digitising the 400,000 recordings, by having those 400,000 78-rpm discs played on the CEDAR hardware and the CEDAR output captured digitally. It is the hardware on which the disc is played that uses special techniques to minimise the amount of surface noise and distortion that gets included in the output signal: the CEDAR process is designed around an engineering concept of generating a clean signal in the first place. It is a successful concept; it is not about taking a dirty/noisy signal and cleaning it up; it is about preventing the signal from being ‘dirty’ in the first place. If the 400,000 discs were digitised without using CEDAR, the digitisation effort was largely wasted, because much better results can be achieved by doing it using CEDAR. A so-called ‘software cleanup’ of the existing audio captures will never sound as good as an initial audio capture using CEDAR.

    In my opinion, to train an AI system would fail because the signal output from a 78 rpm disc must vary depending upon the nature of the signal source, i.e. depending upon the amount and distribution of surface wear and damage: even if you reject all discs other than shellac discs, the amount of ingrained dirt on each disc, and the amount of wear on each disc (unlike PVC, shellac is brittle and suffers surface damage in direct proportion to the number of times it has been played, and the type of stylus/needle used each time), creates a nearly infinite number of variables, too many to be handled. You are not comparing like with like, because it is pretty much true to say that no two 78s are ever alike.

    It is a misunderstanding to describe the system as 78 rpm. Many so-called 78s were recorded at speeds nothing like that. The discs commonly show characteristics of having been recorded at much slower speeds. 78 rpm was an ideal to aim at, but the further back in history you go before the 1950s, the less likely it is that your chosen disc actually approximates the 78 rpm target.

    Without RIAA equalisation, the discs pretty much cannot be alike. Before RIAA, a system which was mainly applied to 33 rpm vinyl, not to 78s, each record company applied its own equalization, using dozens (hundreds?) of differing standards, with different reproduction results obtained if, on playback, you did not match the recording standard used. Probably you could not achieve more than a guess as to what that standard was.

    And if you used an audio digitisation system that involves playing the discs on dozens of different machines, you introduce additional variables, not present when a single machine is used for playback.

    Under these conditions, it is misleading to describe the 400,000 recordings as a ‘single’ data set. The recordings are not comparable with each other, the discs have fundamental incompatibilities with one another, because the 78 rpm system is inherently low-fidelity, and is highly subject to sound variation from all of the problems I describe.

    1. Jon

      I think we should all just be overwhelmingly negative about this idea and not even try. Because trying new things and failure never got anyone anywhere. So there.

  5. Marco G

    I work in music restoration and vinyl. This project might be better suited for Wow and Deflutter restoration (Wow & Flutter being the variance in pitch that arises because an analog medium, like a record on a turntable, does not move at a 100% consistent speed). Compared to click and pop healing, these tools are in a sorry state. I have talked with AI audio experts before and it sounds like a fixable problem; it’s just that nobody has done it yet.
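
For reference, the workload estimates in comment 4 can be reproduced with a few lines of arithmetic. This is only a sketch: the 40-hour working week is an assumption that the comment does not state, and the function name is hypothetical.

```python
TOTAL_RECORDINGS = 400_000

def weeks_needed(total, workers, per_worker_per_week):
    """Weeks to process `total` recordings with `workers` people at a fixed weekly rate each."""
    return total / (workers * per_worker_per_week)

# One engineer at half an hour per recording.
# ASSUMPTION: a 40-hour working week (not stated in the comment).
per_week = 40 / 0.5  # 80 recordings a week, i.e. "fewer than a hundred"
single_engineer_weeks = weeks_needed(TOTAL_RECORDINGS, 1, per_week)

# The comment's round figure of 100 recordings a week yields its 4,000-week estimate.
rounded_weeks = weeks_needed(TOTAL_RECORDINGS, 1, 100)

# 400 volunteers, each processing 100 recordings a week.
volunteer_weeks = weeks_needed(TOTAL_RECORDINGS, 400, 100)

print(single_engineer_weeks, rounded_weeks, volunteer_weeks)
```

Under the 40-hour assumption the single-engineer figure is larger than the comment's round number, which only strengthens its point about scale.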
