As mentioned in our recent Building Music Libraries post, we are working with researchers at Columbia University and UPF in Barcelona to run their code on the music collection, both to help their research and to provide new analyses that could help with exploration and understanding.
We are doing some pilot runs that generate new files, which close observers may notice in the music item directories on archive.org. Audio fingerprints from audfprint are stored in .afpt files, and music attributes from Essentia are stored in _esslow.json.gz (download sample) and _esshigh.json.gz files.
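For those who want to peek inside one of the Essentia attribute files, here is a minimal Python sketch that decompresses a .json.gz file and lists its top-level keys. It assumes only what the file extension implies (gzip-compressed JSON) and that the top level is a JSON object. The function names and the file name in the usage note are hypothetical, and nothing here assumes which attributes Essentia actually writes.

```python
import gzip
import json


def load_gzipped_json(path):
    """Read a gzip-compressed JSON file and return the parsed object."""
    # "rt" opens the decompressed stream in text mode so json.load can read it.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def top_level_keys(path):
    """Return the sorted top-level keys of a gzip-compressed JSON file.

    Assumption: the file holds a JSON object (a dict) at the top level.
    """
    data = load_gzipped_json(path)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return sorted(data)
```

Calling top_level_keys("example_esslow.json.gz") on a downloaded file (the name is a placeholder) would show which groups of attributes it contains without assuming their names in advance.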
We are also creating image files showing the audio spectrum of each file. We hope these are useful for those who want to see whether files have been compressed in the past (even if they are posted as FLAC files now). There is also a .png of a basic waveform for each audio file, which is being used on the Archive’s beta site as eye candy.
More as it happens, but we wanted you to know that there is some progress and that you will see some new files. If you have proposed other analyses that would benefit from being run over a large corpus, please let us know by contacting info at archive dot org.
Thank you to the researchers and the Archive programmers who are working together to make this happen.