Fast and reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz


archive.org has started making Theora derivatives for movie files: for each movie file, we create an Ogg Theora video. After trying a bunch of tools over a good corpus of wide-ranging videos, I found a neat way to make the Archive derivatives.

High Level:

  • use ffmpeg to turn any video to “rawvideo”.
  • pipe its output to *another* ffmpeg to turn the video to “yuv4mpegpipe”.
  • pipe its output to the libtheora tool.
  • for videos with audio, have ffmpeg create a vorbis audio .ogg file.
  • add tasty metadata (with liboggz utils).
  • combine the video and audio ogg files to an .ogv output!

Detailed example:

  • ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - | ffmpeg -an -f rawvideo -s 400x300 -r 20.00 -i - -f yuv4mpegpipe - | libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o tmp.ogv
  • ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg
  • oggz-comment audio.ogg -o audio2.ogg TITLE="Cape Cod Marsh" ARTIST="Tracey Jaquith" LICENSE="http://creativecommons.org/licenses/publicdomain/" DATE="2004" ORGANIZATION="Dumb Bunny Productions" LOCATION=http://www.archive.org/details/CapeCodMarsh
  • oggzmerge tmp.ogv audio2.ogg -o CapeCodMarsh.ogv
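
The steps above can be strung together for any source file. What follows is only a sketch, not the script archive.org actually runs: the function name `theora_plan`, the `ENCODER` variable, and the fixed `tmp.ogv` / `audio.ogg` scratch names are made up for illustration, and the oggz-comment metadata step is left out to keep it short. The function just prints the commands from the example with the file name, size, and frame rate filled in, so you can eyeball them before running anything. Newer ffmpeg releases may not accept every flag exactly as written here.

```shell
# Sketch only: print the encode commands for one source file.
# theora_plan, ENCODER and the scratch file names are illustrative
# assumptions, not names used by archive.org.
ENCODER=${ENCODER:-libtheora-1.0/lt-encoder_example}

theora_plan() {
  src=$1    # source video, e.g. CapeCodMarsh.avi
  size=$2   # output frame size, e.g. 400x300
  rate=$3   # output frame rate, e.g. 20.00
  out=$4    # final file, e.g. CapeCodMarsh.ogv

  # Step 1: video only. rawvideo -> yuv4mpegpipe -> libtheora example encoder.
  printf '%s\n' "ffmpeg -an -deinterlace -s $size -r $rate -i '$src' -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - | ffmpeg -an -f rawvideo -s $size -r $rate -i - -f yuv4mpegpipe - | $ENCODER --video-rate-target 512k - -o tmp.ogv"

  # Step 2: audio only, encoded with libvorbis.
  printf '%s\n' "ffmpeg -y -i '$src' -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg"

  # Step 3: merge the video and audio streams into the final .ogv.
  printf '%s\n' "oggzmerge tmp.ogv audio.ogg -o '$out'"
}
```

Calling `theora_plan CapeCodMarsh.avi 400x300 20.00 CapeCodMarsh.ogv` prints the three commands; adding `| sh` at the end runs them. File names are wrapped in single quotes, so a name that itself contains a single quote would need extra care.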

WTFs:

  • Why the double pipe above? Some videos could not be converted directly to yuv4mpegpipe format in a way that libtheora (or ffmpeg2theora) would reliably accept.
  • We do the vorbis audio outside of libtheora (or ffmpeg2theora) to avoid any issues with Audio/Video sync.
  • We convert to yuv420p in the rawvideo step because ffmpeg2theora has (I think) some known issues with not handling all yuv422 video inputs (I found at least a few videos that failed this way).
  • We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)
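
Since the audio is handled in its own step, that step only makes sense for sources that actually have an audio track. Here is one hedged way to check, assuming a reasonably recent ffmpeg install that ships `ffprobe` (the post itself does not say how the Archive detects audio, and `has_audio` is a hypothetical helper name):

```shell
# Sketch: succeed if the file has at least one audio stream.
# Assumes ffprobe (bundled with recent ffmpeg builds) is on PATH;
# has_audio is a made-up helper name, not part of any tool.
has_audio() {
  # ffprobe prints one line ("audio") per audio stream and nothing
  # at all for a silent file.
  streams=$(ffprobe -v error -select_streams a \
              -show_entries stream=codec_type -of csv=p=0 "$1")
  [ -n "$streams" ]
}
```

Used as `if has_audio CapeCodMarsh.avi; then ...; fi`, this lets a script skip the vorbis and merge steps for silent sources.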

So this will end up working in Firefox 3.1 and greater, using the new HTML “video” tag:

<video controls="true" autoplay="true" src="http://www.archive.org/download/commute/commute.ogv"> for firefox betans </video>

This technique worked nicely across the 46 wide-ranging source and “trashy” videos that I use for QA before making a new way of deriving our videos live at archive.org.
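
A QA pass like that can be sketched as a small loop. This is an assumption about how one might do it, not the Archive's actual QA harness: `qa_corpus` and the encode callback are hypothetical names, and "passing" here only means the encode exited cleanly and left a non-empty .ogv behind.

```shell
# Sketch: run an encode command over a directory of test videos and
# flag the ones that fail or leave an empty output file.
# qa_corpus and the encode callback are hypothetical names.
qa_corpus() {
  dir=$1      # directory holding the QA source videos
  outdir=$2   # directory that receives the .ogv outputs
  encode=$3   # function or command called as: encode <src> <out>
  failed=0
  for src in "$dir"/*; do
    [ -f "$src" ] || continue
    name=$(basename "$src")
    out="$outdir/${name%.*}.ogv"
    # A clean exit is not enough: also require a non-empty output file.
    if "$encode" "$src" "$out" && [ -s "$out" ]; then
      echo "ok   $src"
    else
      echo "FAIL $src"
      failed=$((failed + 1))
    fi
  done
  # Succeed only when every video encoded.
  [ "$failed" -eq 0 ]
}
```

The non-empty check matters because a broken pipe stage can leave a zero-byte output without any obvious error.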

-tracey jaquith

18 thoughts on “Fast and reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz”

  1. j

    you should not use the ffmpeg vorbis encoder, it is really bad quality,
    please use libvorbis. you can do this by changing your line to:
    ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg

  2. Maik Merten

    Actually with a libvorbis encoder one could encode the audio track with like 80 kbit/s and give the spare bitrate to the video stream.

    I’d go for encoding Vorbis with the oggenc tool, not ffmpeg (which may have to be built with special options to allow encoding with libvorbis) and first decode to .wav (or pipe raw PCM samples)

    oggenc --resample 44100 -b 80 audiodump.wav -o audio.ogg

  3. Pete D.

    It’s really great to see Ogg Theora videos being widely used in archive.org. Thanks for these efforts!

  4. somebody

    This doesn’t work for me, it just throws an empty tmp.ogv .. it seems that the pipes aren’t working because if I ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo OUTPUT .. it works and gives a huge file. but I don’t have the space to do each one separately. that’s on ubuntu hardy. any ideas on how to make the pipes work?

  5. Gregory Maxwell

    Please don’t use the Vorbis encoder included in FFMPEG. It produces very low quality, even at high bitrates.

    It is hard for me to express in words how much worse the ffmpeg Vorbis streams sound. So, I’ve put up some 11 second examples: With my test file when you ask ffmpeg for 128kbit/sec you get this 64kbit/sec result. At a comparable output bitrate Xiph.Org libVorbis gives this result and even at 45kbit/sec libVorbis simply sounds much better. … and Xiph.Org libVorbis isn’t even currently the best encoder available for these bitrates.

    After listening I’m sure you can see that this is not just a nit-picking difference. The ffmpeg output simply sounds *bad*. It’s not something which should be associated with the Vorbis name, and it’s not the quality that the public already expects from Vorbis. (If you have trouble playing the FFmpeg-produced Ogg, this may be because the file is also not spec compliant, though it played on everything I had available to me.)

    I’m unsure why ffmpeg is not shipping one of the liberally (BSD) licensed encoders. The Xiph.Org reference encoder in libvorbis would be an acceptable and obvious choice. (Although AoTuV would likely be a better choice, the difference is small compared to the output of the FFMPEG encoder). I don’t think most people in the Vorbis world were even aware of the FFMPEG encoder until it was noticed how poor the archive.org files sounded, as most people producing Theora files are using ffmpeg2theora which makes use of libVorbis to encode Vorbis audio.

    The above processing could probably be amended to have the ffmpeg audio step output PCM, and pipe that into oggenc. Alternatively it may be possible to get ffmpeg to use libvorbis, as ffmpeg2theora does.

    1. pookito

      Hum, that is something to think about. I thought ffmpeg was good. I just finished posting a blog about it. I never knew that the ffmpeg theora/vorbis output file was lower quality than the ffmpeg2theora one. I am going to check that out. What I have noticed is that ffmpeg2theora takes a long time, I mean a really long time, to do the same job that ffmpeg does.

  6. Pingback: ProCasts Blog » Converting screencasts to Ogg Theora (.ogv)

  7. internetarchive

    yes, ffmpeg on newer ubuntu linux distros can be set up to use libvorbis, so we are doing that now. we weren’t using “-acodec vorbis” previously just to showcase poor quality audio, but simply because in our 18-month old OS, that alternative simply wasn’t there.

    we updated to newer ffmpeg and “libvorbis” just before tax day 2009.

  8. Gregory Maxwell

    Not sure why it would be an issue of age: Libvorbis support in ffmpeg is much older than their built in encoder.

    In any case— fantastic news!

  9. George Chriss

    Just ran into the same problem:
    “We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)”

    This is true with oggz-comment, and it still passes validation. Yikes!

