Category Archives: Live Music Archive

Audio / Video player updated – to jwplayer v8.2

We updated our audio/video (and TV) 3rd party JS-based player from v6.8 to v8.2 today.

We added some code so the player keeps the same feature set as before, and it also brings new features:

  • much nicer cosmetic/look updates
  • nice “rewind 10 seconds” button
  • controls are now in an updated control bar
  • (video) ‘Related Items’ now uses the same (better) recommendations from the bottom of an archive.org /details/ page
  • Airplay (Safari) and Chromecast basic casting controls in player
  • playback speed rate control now easier to use / set
  • playback keyboard control with SPACE and the left, right, up, and down keys
  • (video) Web VTT (captions) has much better user interface and display
  • flash is now only used to play audio/video if html5 doesn’t work (flash does not do layout or controls now)

Here’s some before / after screenshots:

archive.org download counts of collections of items updates and fixes

Every month, we look over the total download counts for all public items at archive.org.  We sum item counts into their collections.  At year end 2014, we found various source reliability issues, as well as overcounting for “top collections” and many other issues.

archive.org public items tracked over time

To address the problems, we:

  • Rebuilt a new system to use our database (DB) for item download counts, instead of our less reliable (and more prone to “drift”) SOLR search engine (SE).
  • Changed monthly saved data from JSON and PHP serialized flatfiles to new DB table — much easier to use now!
  • Fixed overcounting issues for collections: texts, audio, etree, movies
  • Fixed various overcounting issues related to not unique-ing <collection> and <contributor> tags (more below)
  • Fixes to character encoding issues on <contributor> tags

Bonus points!

  • We now track *all collections*.  Previously, we only tracked items tagged:
    • <mediatype> texts
    • <mediatype> etree
    • <mediatype> audio
    • <mediatype> movies
  • For items we are tracking <contributor> tags (texts items), we now have a “Contributor page” that shows a table of historical data.
  • Graphs are now “responsive” (scale in width based on browser/mobile width)

 

The Overcount Issue for top collection/mediatypes

  • In the below graph, mediatypes and collections are shown horizontally, with a sample “collection hierarchy” today.
  • For each collection/mediatype, we show 1 example item, A B C and D, with a downloads/streams/views count next to it parenthetically.   So these are four items, spanning four collections, that happen to be in a collection hierarchy (a single item can belong to multiple collections at archive.org)
  • The Old Way had a critical flaw — it summed all sub-collection counts — when really it should have just summed all *direct child* sub-collection counts (or gone with our New Way instead)

overcount

So we now treat <mediatype> tags like <collection> tags for counting purposes, and we de-duplicate each item’s <collection> tags, so that items with slightly messy metadata (such as a repeated tag) do not cause another kind of overcounting.
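As a rough illustration of the counting rule described above, here is a minimal Python sketch. It is not the production code: the item dictionaries, field names, and the `collection_totals` function are hypothetical and only show the idea of treating the mediatype as one more collection tag and counting each item once per unique tag.

```python
from collections import defaultdict

def collection_totals(items):
    """Sum download counts per collection, counting each item once per unique tag."""
    totals = defaultdict(int)
    for item in items:
        # Treat the mediatype like a collection tag, then de-duplicate the tags.
        tags = set(item["collections"]) | {item["mediatype"]}
        for tag in tags:
            totals[tag] += item["downloads"]
    return dict(totals)

# Hypothetical sample data: item A carries a repeated "etree" tag.
items = [
    {"id": "A", "mediatype": "audio",
     "collections": ["etree", "GratefulDead", "etree"], "downloads": 10},
    {"id": "B", "mediatype": "audio",
     "collections": ["etree"], "downloads": 5},
]
totals = collection_totals(items)
```

With this sample data, item A contributes its 10 downloads to "etree" only once even though the tag appears twice in its metadata.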

 

… and one more update from Feb/1:

We graph the “difference” between absolute downloads counts for the current month minus the prior month, for each month we have data for.  This gives us graphs that show downloads/month over time.  However, values can easily go *negative* with various scenarios (which is *wickedly* confusing to our poor users!)

Here’s that situation:

A collection has a really *hot* item one month, racking up downloads in a given collection.  The next month, a DMCA takedown (or something similar) removes the item from being available (and thus from being counted in the future).  The downloads for that collection can plummet in the next month’s run, when the counts are summed over public items for that collection again.  So that collection would have a negative (net) downloads count change for this next month!

Here’s our fix:

Use the current month’s collection “item membership” list for both the current month *and* the prior month.  Sum counts for all those items for both months, and graph the difference between the two sums.  In just about every situation that remains, graphed monthly download counts will be nonnegative (zero or positive), because both sums cover the same set of items.
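The fix can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `monthly_delta`, the item names, and the counts are made up, and per-item counts are assumed to be absolute (cumulative) totals keyed by item identifier.

```python
def monthly_delta(members_now, counts_now, counts_prev):
    """Difference in summed downloads, using only the current month's member items."""
    total_now = sum(counts_now.get(item, 0) for item in members_now)
    total_prev = sum(counts_prev.get(item, 0) for item in members_now)
    return total_now - total_prev

# Hypothetical data: "hot_item" was taken down before this month's run.
counts_prev = {"hot_item": 900, "steady_item": 100}
counts_now = {"steady_item": 130}
members_now = {"steady_item"}

# Old approach: each month summed over whatever was public at the time.
naive = sum(counts_now.values()) - sum(counts_prev.values())

# New approach: both months summed over the same (current) membership list.
fixed = monthly_delta(members_now, counts_now, counts_prev)
```

Here the old approach yields a negative change (130 minus 1000), while the new approach compares 130 with 100 for the one item that is still a member.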

 

 

Music Analysis Beginnings

As mentioned in our recent Building Music Libraries post, we are working with researchers at Columbia University and UPF in Barcelona to run their code on the music collection to help their research and to provide new analyses that could help with exploration and understanding.

We are doing some pilot runs to generate files which some close observers may see in the music item directories on archive.org.  Audio fingerprints from audfprint are .afpt and music attributes from Essentia are in _esslow.json.gz (download sample) and _esshigh.json.gz.
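For anyone who wants to peek inside one of the .json.gz files, a minimal Python sketch follows. It only assumes the files are gzip-compressed JSON, as the file extension suggests; the demo file name and the key in it are invented for illustration and say nothing about what Essentia actually writes.

```python
import gzip
import json
import os
import tempfile

def load_gzipped_json(path):
    """Read a gzip-compressed JSON file and return the parsed object."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)

# Stand-in file for the demo; the key below is hypothetical.
sample = {"hypothetical_attribute": 1.0}
demo_path = os.path.join(tempfile.mkdtemp(), "track_esslow.json.gz")
with gzip.open(demo_path, "wt", encoding="utf-8") as handle:
    json.dump(sample, handle)

loaded = load_gzipped_json(demo_path)
```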

Spectrogram of a Grateful Dead track

We are also creating image files showing the audio spectrum used.  We hope this is useful for those that want to see if files have been compressed in the past (even if they are posted as flac files now).  There is also a .png for each audio file of a basic waveform that is being used in the archive’s beta site as eye candy.

More as it happens, but we wanted you to know there is some progress and you will see some new files.  If you have proposed other analyses that would benefit from being run over a large corpus, please let us know by contacting info at archive dot org.

Thank you to the researchers and the Archive programmers who are working together to make this happen.

 

Building Music Libraries

The Internet Archive is working with partners to preserve our musical heritage. The music collections started 8 years ago with the etree.org live music recordings and grew when we started hosting netlabels.

Scanning an LP cover

Now through new efforts and partnerships we have begun to expand and explore the music collections further.  We are working with researchers, record labels, collectors, internet communities and other archives to gather music media, build tools for preservation and expand metadata for exploration.

We have already made tremendous progress. We have archived millions of tracks, we are working with the Archive of Contemporary Music to digitize portions of their extensive collections of physical media, the MusicBrainz.org community has provided meticulous metadata, and researchers from university programs have begun to analyze the music.

Listening Room

A prototype “listening room” in the Internet Archive’s building in San Francisco is available free to the public to listen to the full musical holdings.  Access to these collections will also be provided to select computer science researchers via a secure “virtual reading room” in our data center.  As tools and the collections grow, we will offer everyone access to the metadata to help them explore, and then offer links to commercial sites for listening or purchasing.

We invite interested people to participate:

Archives. The Internet Archive and the Archive of Contemporary Music in New York have started digitizing ACM’s holdings with consistent, high quality, standards-based methods to build a scalable workflow.  We welcome other archives with similar projects, or who would like to help.  “Digitizing our large physical collections is an important step for our archive to allow others to learn from this deep legacy,” said Bob George, Director of the Archive of Contemporary Music, NYC.

Digitizing CDs at the Archive of Contemporary Music

Collectors.  Digitize, donate, or lend material for digitization.  Improve metadata or provide context to help others understand the depth and cultural relevance of these collections.  “Recycled Records is happy to have directed the donation of many thousands of LPs to the Internet Archive to help with their projects and for the love of music,” said Bruce Lyall, proprietor of Recycled Records.

Labels.  Preserving a complete collection of everything published by a label is best done by or with the record label.  We would like to work with labels to get their releases archived and properly cataloged.  “The upcoming Music Libraries program continues the very work that enables our label, and the musicians who record for us, to bring the music of earlier times to audiences today. We are proud to participate in a tradition of preservation that has brought joy to so many through music,” said David Fox, Co-founder of Musica Omnia.

Cataloging services.  Commercial and non-commercial cataloging services can participate by making sure there are proper links from and to these collections.  The musicbrainz.org open, community-created catalog has already been very helpful.

Commercial vendors and streaming services.  Links from these collections to commercial services can help users buy and listen to full tracks.  These services might have valuable metadata as well that can help users navigate.

Musicians and bands.  Please create more great works that libraries can preserve and provide access to.  We would like to hear your ideas about making the site useful for both musicians and the general public.

Researchers, historians, and music lovers.  Annotate, organize, datamine, and surface music in the collections, and help us preserve those works not yet in the collections.  “Access to a comprehensive archive of commercial music audio is the key missing link for research relating signal processing to listener behavior,” said Daniel Ellis, professor at Columbia University.  By analyzing the rhythms, keys, instruments, and genres, researchers will help create more complete metadata and aid discovery.

Looking to the future, we hope to expand these shared music collections by uniting the work done by other archives and collectors.  By bringing all of this music and its metadata into a shared library, we hope to bring the richness of our musical heritage to people all over the world.

Visit the Listening Room

Internet Archive
300 Funston Ave
San Francisco, CA 94118
Hours: Fridays from 1-4pm, or by appointment.

If you would like to participate in any way, please email us.

new video and audio player — video multiple qualities, related videos, and more!

Many of you have already noticed that since the New Year, we have migrated our new “beta” player to be the primary/default player, then to be the only player.

We are excited about this new player!
It features the very latest release of jwplayer from longtailvideo.com.

Here’s some new features/improvements worth mentioning:

  • html5 is now the default — flash is a fallback option.  a final fallback option for most items is a “file download” link from the “click to play” image
  • videos have a nice new “Related Videos” pane that shows at the end of playback
  • should be much more reliable — I had previously hacked up a lot of the JS and flash from the jwplayer release version to accommodate our various wants and looks — now we use mostly the stock player with minimal JS alterations/customizations around the player.
  • better HD video and other quality options: uploaders can now offer multiple video size and bitrate qualities.  If you know how to encode web playable (see my next post!) h.264 mp4 videos especially, you can upload different qualities of your source video and the viewer will have the option to pick any of them (see more on that below).
  • more consistent UI and look and feel.  The longtailvideo team *really* cleaned up and improved their UI, giving everything a clean, consistent, and aesthetically pleasing look.  Their default “skin” is also greatly improved, so we can use that now directly too
  • lots of performance cleanup under the hood, and more likely to play in more mobile, browser, and OS combinations.

Please give it a try!

-tracey

 

For those of you interested in trying multiple qualities, here’s a sample video showing it:

http://archive.org/details/kittehs

To make that work, I made sure that my original/source file was:

  • h.264 video
  • AAC audio
  • had the “moov atom” at the front (to allow instant playback without waiting to download entire file first) (search web for “qt-faststart” or ffmpeg’s “-movflags faststart” option, or see my next post for how we make our .mp4 here at archive.org)
  • has a > 480P style HD width/height
  • has filename ending with one of:   .HD.mov   .HD.mp4   .HD.mpeg4    .HD.m4v

When all of those are true, our system will automatically take:

  • filename.HD.mov

and create:

  • filename.mp4

that is our normal ~1000 kb/sec “derivative” video, as well as “filename.ogv”

The /details/ page will then see two playable mpeg-4 h.264 videos, and offer them both with the [HD] toggle button (seen once video is playing) allowing users to pick between the two quality levels.

If you wanted to offer a *third* quality, you could do that with another ending like above but with otherwise the same requirements.  So you could upload:

  • filename.HD.mp4       (as, say, a 960 x 540 resolution video)
  • filename.HD.mpeg4   (as, say, a 1920 x 1080 resolution video)

and the toggle would show the three options:   1080P, 540P, 480P

You can update existing items if you like, and re-derive your items, to get multiple qualities present.
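The filename rule above can be sketched as a small Python check. This is an illustration, not the actual deriver code: the `derivative_name` helper is hypothetical, and treating the endings as case-sensitive is an assumption.

```python
# Endings listed in this post that mark a file as an HD source.
HD_SUFFIXES = (".HD.mov", ".HD.mp4", ".HD.mpeg4", ".HD.m4v")

def derivative_name(filename):
    """Return the expected derivative .mp4 name, or None if not an HD source name."""
    for suffix in HD_SUFFIXES:
        if filename.endswith(suffix):
            return filename[: -len(suffix)] + ".mp4"
    return None

hd_result = derivative_name("filename.HD.mov")
plain_result = derivative_name("filename.mp4")
```

The check covers only the filename; the codec, “moov atom”, and resolution requirements listed earlier would still need to hold.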

Happy hacking!

 

 

 

getting only certain formats in .zip files from items — new feature

Per some requests from our friends in the Live Music Archive community…

You can get any archive.org item downloaded to your local machine as a .zip file (we’ve been doing that for 5+ years!)
But whereas before it would be all files/formats,
now you can be picky/selective about *just* certain formats.

We’ll put links up on audio item pages, minimally, but the url pattern is simple for any item.
It looks like (where you replace IDENTIFIER with the identifier of your item (eg: thing after archive.org/details/)):

http://archive.org/compress/IDENTIFIER

for the entire item, and for just certain formats:

http://archive.org/compress/IDENTIFIER/formats=format1,format2,format3,….

Example:


wget -q -O - 'http://archive.org/compress/ellepurr/formats=Metadata,Checksums,Flac' > zip; unzip -l zip
Archive:  zip
  Length      Date    Time    Name
---------  ---------- -----   ----
  1107614  2012-10-30 19:49   elle.flac
       44  2012-10-30 19:49   ellepurr.md5
     3114  2012-10-30 19:49   ellepurr_files.xml
      693  2012-10-30 19:49   ellepurr_meta.xml
      602  2012-10-30 19:49   ellepurr_reviews.xml
---------                     -------
  1112067                     5 files
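If you would rather build these urls in a script, here is a minimal Python sketch of the pattern described above. The `compress_url` helper is hypothetical, and it does no URL-encoding, so format names containing spaces or other special characters may need extra handling.

```python
def compress_url(identifier, formats=None):
    """Build the /compress/ url for an item, optionally limited to some formats."""
    url = "http://archive.org/compress/" + identifier
    if formats:
        # Formats are joined with commas, as in the url pattern above.
        url += "/formats=" + ",".join(formats)
    return url

whole_item = compress_url("ellepurr")
some_formats = compress_url("ellepurr", ["Metadata", "Checksums", "Flac"])
```

The second call produces the same url used in the wget example.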

Enjoy!!

new audio/video player — safari/IE improvements

below the current audio/video player on archive.org you have probably seen by now the link:

Would you like to try our new audio/video player? (beta!)

We had some known problems in this beta rollout that affected audio MP3 playback.

Specifically, on Safari, some 30-70% of the time (and it varied widely) the MP3 loading/setup would fail.  This has been fixed.   On Internet Explorer, we didn’t have the MP3 “flash based playback” option setup using the new audio player — and the lead developer, Michael Dale, came over today and fixed that for us.   Hooray!

So at this point, I believe the audio/video player is true “beta” — feature complete with a few things to smooth out left but the finish line is close:

1) i need to add back in captions/subtitles (it’s there in the player, just need to feed them through with our playlist)

2) video items with 3+ videos may play the last video 2x.  working on that!  😎

hopefully, we can all listen to some nice archive music this weekend in peace without issues with this new player!  now grab your headphones or turn up those speakers…

-tracey

Milestone: 50,000 free live music concerts

We have hit a major milestone:
50,000 individual shows by 2924 bands
are now freely downloadable and streamable from
http://www.archive.org/details/etree

What a success for the commons!

(from Tyler’s post: http://www.archive.org/iathreads/post-view.php?id=195573 )

Project starts in aug/sept 2002 …

10,000 shows
March 26, 2004
http://www.archive.org/iathreads/post-view.php?id=13564

16,000
Aug 31, 2004
http://www.archive.org/iathreads/post-view.php?id=21086

20,000
Feb. 3, 2005
http://www.archive.org/iathreads/post-view.php?id=28212

25,000
July 21, 2005
http://www.archive.org/iathreads/post-view.php?id=39351

30,000
Feb. 13, 2006
http://www.archive.org/iathreads/post-view.php?id=56015

35,000
May 13, 2006
http://www.archive.org/iathreads/post-view.php?id=61084

40,000
June 9, 2007
http://www.archive.org/iathreads/post-view.php?id=128966

45,000
December 3, 2007
http://www.archive.org/iathreads/post-view.php?id=168428

50,000
June 4, 2008

I predict 55,000 by January 1, 2009. Let’s keep this project rolling! 60,000 by next memorial day? we’ll see! so much of the recorded library out there for all bands is already up on archive.org, the next 10,000 would have to be mostly new stuff. get out there and ask your favorite band if they are okay with sharing their live recordings on archive.org’s Live Music Archive project!

20,000 Live Music Archive Concerts!

Through an outpouring of good work of bands, tapers, and fans, we have hit a major milestone: 20,000 live music concert recordings are being freely shared and enjoyed.

-brewster

A cool report from Parker:

I’d take special note of a few statistics that are particularly interesting. The shows have doubled from 10k in 13 months, and the number of archive-friendly bands has almost doubled. Also interesting is the proportion of shows to venues and tapers. Our ratio of shows to tapers is only about 5:1, and our ratio of shows to venues is less than 3:1. Given that the big guys like the Grateful Dead and Phil Lesh skew this number upwards, it really shows that the Live Music Archive is a grassroots effort.

Use:

average bandwidth: 650mb/sec
shows accessed: 8,876,402

User Stats:

uploaders: 1381
uploaders with more than 10 shows uploaded: 380
tapers: 3998

Concert Stats:

shows: 20,141
shows with mp3s (more as space allows): 6981
tracks: ~300,000
bands: 856
bands disallowing mp3: 44
venues: 7713

Machine Stats:

total space for live concerts: ~25tb
machines: 74 including backups

pt.
——————–
Parker Thompson
Data Archivist
The Internet Archive

Live Music Archive at 10,000 Concerts

Yippie!

The Live Music Archive just received its 10,000th concert recording!
Congratulations and thanks to the etree community and the artists and
bands that have made this a fantastic repository of creative works.
http://www.archive.org/audio

A few stats:
Now stored: 10,000 shows, 150,000 tracks, from almost 500 bands, in 18 months.
About 10 Terabytes of information on 40 Terabytes of disk space on 60 linux boxes.
almost 1,000 uploaders, 300 uploaders have uploaded more than 10 shows,
Brad Lablanc is the top uploader at 251 concerts
Usage: about 1 Petabyte of concerts have been downloaded
(1,000,000,000,000,000 bytes)
Or about 1 Million shows.
Currently 400 megabits/sec of outbound bandwidth.

In the next couple of weeks:
Adding some Grateful Dead concerts because their policy allows
non-commercial hosting.
Converting most of the concerts to mp3/ogg and full concert zip files

Longer term wish list:
Broaden the collection into lots of different types of music
Internet Radio interface
Improve the website’s searching, speed, and browsing

If you have other ideas, please write to us on the forums or to
etree@archive.org

Great stuff! Thank you to all that have put in so many hours. This
is a partial list:

Brad Leblanc
Bram Cohen
Caleb Epstein
Diana Hamilton
Ghost
Greg Pope
Lauren Gelman
Matt Vernon
Ry4an Brase
Tom Anderson
Tom Horton
Tyler Huff

enjoy,

-brewster