New strategy for Internet Archive movies!

We have rebuilt all of our nearly 200,000 videos at the archive!

[We finished this Dec 1, 2008]

Related cross-blog with OLPC.

Here is a table-based chart of which video formats will be “derived” into which formats (we are creating 4 formats per video now):

Improvements and changes from our prior movie techniques:

  • We will make a new Ogg Theora (with Vorbis audio) opensource/free-based video derivative. This derivative will play natively in Firefox 3.1 release (v3.1 is due around the end of 2008).
  • We are remaking h.264 MPEG-4 derivatives. We have updated the format to work with lighttpd “mod_h264_streaming” (which allows jumping into a movie at a specified time), but in the process we will lose the ability to serve/stream this file with RTSP.  This derivative also plays in the Adobe Flash plugin and on iPods/iPhones.
  • We are removing older 64kb and 256kb MPEG-4 derivatives.  With “progressive download” support becoming ubiquitous, even modems and phones are doing much better with downloading larger files.
  • We are removing older .flv “Flash Video” derivatives.  Since the much better quality h.264 derivative plays in recent flash plugins (as well as many other devices and browsers), the flash video alternative is seen as less ideal.
  • We are removing older .mpg MPEG-1 derivatives.  Their usefulness has declined in recent years, especially compared to h.264 alternatives.
  • We are remaking our animated GIFs. Each GIF attempts to use 30 thumbnails from its uploaded video.  We now space them evenly across the entire video.
  • We are remaking our thumbnails. Similar to the GIFs, we are spreading them across the videos better, and making fewer thumbnails for items with *many* videos.  Additionally, we are renaming the thumbnails to indicate the position (in seconds) in the video at which they were created.  This will allow for the next bullet item…
  • We have developed the ability to jump into videos by clicking on the thumbnail image (to go to that scene!) We are finalizing the URL / permalinks for these “jump into video” URLs and will release this live to the public as soon as we can.
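
The even spacing and second-based naming described in the thumbnail bullets above can be sketched roughly as follows. This is only an illustration, not the Archive's actual derive code: the function names, the choice to sample the middle of each slice, and the filename pattern are all assumptions made for the example.

```python
def thumbnail_seconds(duration_seconds, count=30):
    """Return up to `count` evenly spaced second offsets across a video.

    Assumption: each thumbnail is taken from the middle of one of `count`
    equal-width slices of the video. Very short videos yield fewer offsets
    because duplicate seconds are dropped.
    """
    if duration_seconds <= 0 or count <= 0:
        return []
    step = duration_seconds / count
    offsets = [int(step * (i + 0.5)) for i in range(count)]
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(offsets))


def thumbnail_name(video_stem, second):
    """Build a thumbnail filename that records its second position.

    The zero-padded pattern here is hypothetical, not the real naming scheme.
    """
    return f"{video_stem}_{second:06d}.jpg"


if __name__ == "__main__":
    for second in thumbnail_seconds(300, count=5):
        print(thumbnail_name("movie", second))
```

Because the second offset is carried in the filename, a player could in principle read it back from a clicked thumbnail and seek to that point, which is the idea behind the "jump into video" feature.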

-tracey jaquith

28 thoughts on “New strategy for Internet Archive movies!”

  1. Pingback: The Internet Archive’s New Video Strategy (Ogg, H.264) | Phil Nelson Writes Here

  2. Andrew

    I knew subscribing to this would bring some technical news now and again, neat! The IA needs more news, perhaps from the collection curators too.

    Anyway, great work Tracey!

    Are there any numbers on how many videos there are and the time it’ll take to re-derive everything? I mean, it took a while to do past entries, and there must be tens of thousands of videos stored on the IA’s servers.

    I ask because I’ll have to try out the embedded flash player once it’s all done – it kept “not working” (embedding on other sites) for me, and having it working 100% (especially the jumping around and loading problems it previously had) would be awesome, and would make YouTube redundant for entries uploaded to the IA.

    Oh, and just out of interest, does it in fact save some disk space? since you’re removing the unnecessary old ones, I did wonder.

    Here’s to viewing higher quality derivatives! Hooray 🙂

  3. Andrew

    Oh, and I like the thumbnail thing. Good job on that! can’t wait to try it.

    While there are always issues with some source files which can’t be helped (among them is, of course, multiple audio tracks, or two copies of the same video uploaded at different qualities or in different formats which have similar names – nothing major though), that solves one major issue of massively-long videos with variable scene content, where someone who knows what they are looking for can jump forwards, like scene markers in a DVD.

  4. Eight-Bit Bandit

    I wish I could talk you out of removing the old MPEG-1 files. Oh, they’re dated alright, but they’re the only thing my Windows Movie Making Software can read (it has trouble with MPEG-2’s). I wouldn’t ask you to go to the trouble of making MORE MPEG-1’s, but could you be persuaded to leave the ones you have? Please?

  5. Pingback: Internet Archive to remove some "derived" video formats : Real Worcester - Worcester News and Blogs

  6. Mike Benedetti

    I’m very excited about these changes. But I’m disappointed that the old, lo-fi files are disappearing. I have a number of links to the 256kbps mp4s, and embedded players that use the old flvs. I’m sure the Archive has thought these things through, but FWIW, I would love it if the extant lo-fi files stuck around.

  7. redjade

    Is there a central email to techy people that are involved in the audio side of things that I can talk to?

    The new flash player has some problems with it that I would like to explain.


  8. Publisher who's Peeved

    Why are the FLV files no longer active? All references to these movies are now dead, and I’m wondering why the admin didn’t allow the old files to remain?

    It seems that the archive site is violating best practice, and that sites that link to movies are going to have to stop relying on things being “there”….

    Is there a site where our national archives are not going to be changed for “fun” and amusement?

  9. Ahrvid Engholm

    You say: “We are removing older 64kb and 256kb MPEG-4 derivatives. With “progressive download” support becoming ubiquitous, even modems and phones are doing much better with downloading larger files.”
    For me this is no “improvement”. I download and watch the movies on a small sub-notebook with a tiny disk, and the 64kb versions of movies were perfect for that, taking up little space, having a reasonable resolution for my small screen.
    Please reinstate the 64k movies!


  10. tracey jaquith

    OK, so we didn’t do this “just for fun” — we consulted with a lot of opensource video folks and groups. This also wasn’t a flippant decision. We spent months working on it and getting it set up.

    We got rid of the .flv and .mpg and lower quality .mp4 videos to save space and stop supporting the use of them. We have left all original uploaded quality versions of the movies on our site, and will always continue to do so.

    Every once in a while we change the way we make “derivatives”, so that’s what we did here.
    We’ve been adding and changing the way we make new derivatives for new movies for a while, on and off. This is the first time in 5 years that we’ve comprehensively gone through all the movies and actually removed derivative formats like this.

    We are sorry for any inconvenience, truly!
    I know some of you had linked/embedded versions of our .flv derivatives. They can be updated relatively simply, at least, by changing the “.flv” suffix to “_512kb.mp4”, and not only will the flash plugins “just work” — the videos will be significantly higher quality for the amount of bytes.

    As an archive, we will always leave the original files alone.
    Occasionally, like we did this time, we will change the way we make “user accessible” derivatives to follow the best practices of the time.
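
For anyone with many old links to update, the suffix change described above could be scripted along these lines. This is a minimal sketch only: the function name and the example URL are made up for illustration, and it assumes the item actually has a “_512kb.mp4” derivative alongside the old .flv name.

```python
from urllib.parse import urlsplit, urlunsplit


def flv_to_mp4(url):
    """Rewrite a URL ending in ".flv" to the matching "_512kb.mp4" name.

    URLs whose path does not end in ".flv" are returned unchanged.
    Any query string or fragment is preserved.
    """
    parts = urlsplit(url)
    suffix = ".flv"
    if not parts.path.lower().endswith(suffix):
        return url
    new_path = parts.path[: -len(suffix)] + "_512kb.mp4"
    return urlunsplit(parts._replace(path=new_path))


if __name__ == "__main__":
    # Hypothetical URL, used only to show the rewrite.
    print(flv_to_mp4("http://example.org/download/item/movie.flv"))
```

It is worth checking that each rewritten link loads before replacing it in a page or embed code.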


  11. Andrew

    I’d still be interested in the size stats if they were around 😉 (I just sometimes wonder “how big?” when I visit a root directory on a server!) I still say good work too 🙂

    Since you made use of the blog this time, maybe announcing changes like this in advance through some IA news outlet (of which this seems to be the only one right now?) would help the other guys.

  12. Pingback: Software Livre no SAPO » Blog Archive » Arquivo Vídeo da Internet em formato Ogg/Theora

  13. Pingback: Internet Archive converte vídeos para OGG/Theora | Open Mania

  14. Pingback: Metavid Blog » Ogg support & The future of Interoperable Archives.

  15. Pingback: Internet Archive adopts Ogg/Theora, Firefox and OLPC loves it |

  16. ianu

    I have test downloaded some of the ogg files and find that they are markedly degraded both in picture and audio, using VLC player, when compared to the MP4’s (e.g., ClassicTV/Studio One/The_Laugh_maker). Ogg is a poor project with few players able to play it and it is likely to stay that way. They should think about their primary aim as an archive and stop trying to spread open source.

    It also appears that, at the same time as the ogg conversion project, they have stopped their servers from downloading MP4 files with restart capability or accelerated multistreaming (but not other file types like avi, ogv or zip). As a result it has become impossible to download MP4 files if they take more than a few minutes. Seems like a rabid open sourcer is trying to force the use of his rubbish to the detriment of the purposes of the archive.

  17. Hampton

    I commend you for having the forward vision to choose a media format best suited for digital preservation. This seems to me a very practical decision.

  18. Elwood P.

    Best wishes with your conversion project. It sounds like quite a task! The modifications to the thumbnails sound interesting. If I may add my 2 cents, would it be possible to turn off the display (or stop the animation) of thumbnails? They are very annoying when watching the movie via the embedded player – like watching two movies at once.

  19. wytchcroft

    Happy new year – appreciate your efforts – ARCHIVE is the eighth wonder of the world! (kinda).

  20. Bill Barstad

    I think making H.264 MPEG-4 derivatives is a great idea. The old 256KB MPEG-4s looked terrible. The new MPEG-4 files available, however, still show compression artifacts when viewed full-screen on my 20″ iMac. I think adding a H.264 MPEG-4 derivative encoded at a higher bitrate would solve this problem and would effectively replace the old MPEG-1 derivatives I used to download.

    I agree with ianu about the relatively poor quality of the Ogg-Vorbis files.

  21. Samuda

    dreaming of day when it will be possible to have a RSS feed of media based on keyword searches. . . add to any links off of media pages encouraging people to visit and support the archive resources. . .
    glad to have found the blog

  22. Peter

    Ogg is a poor project with few players able to play it and it is likely to stay that way.

    On the contrary, Ogg is excellent and virtually all players can play Ogg content. The quality of the IA encoder leaves a great deal to be desired, however – the reference encoder is orders of magnitude better than the ffmpeg one IA are using; the problem is the sucky encoder (IA: please fix it!), not the format.

    1. internetarchive

      @Peter — we are using the reference encoder for the video and have all along. we started using the reference encoder for the audio portion late in the game but all new videos are getting the better audio (since apr 15).
      –tracey jaquith

  23. parvati

    The thumbnails are very nice!

    What is the command line that you use in ffmpeg to create the gif thumbnails?

  24. Dave K

    Hi Tracey, I’m dealing with the issue of “digital legacy” for families who are now accumulating terabytes of digital content, especially videos. The Archive could be influential in helping everyone deal with the growing problem of preserving this content for future generations. I would like to see an open source reference standard for a “best” archival video file format. If we can agree on what that format is, then we in the community can archive and preserve our digital content in a format that we know will be supported over the long term. Perhaps you already have something like this, and if so, I’d appreciate it very much if you or other knowledgeable people here could tell us about it. Thank you, Dave.

Comments are closed.