archive.org has started making theora derivatives for movie files: for each movie file we create an Ogg Theora video output. after trying a bunch of tools against a wide-ranging corpus of videos, i found a neat way to make the Archive derivatives.

High Level:

use ffmpeg to turn any video to “rawvideo”.

pipe its output to another ffmpeg to turn the video to “yuv4mpegpipe”.

pipe its output to the libtheora tool.

for videos with audio, have ffmpeg create a vorbis audio .ogg file.

add tasty metadata (with liboggz utils).

combine the video and audio ogg files to an .ogv output!
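The "for videos with audio" step above needs some way to tell whether a source has an audio stream at all. A minimal sketch, assuming a current ffmpeg install that ships `ffprobe` (the post itself does not say how this check is done, and `has_audio` is a hypothetical helper name):

```shell
#!/bin/sh
# Assumption: ffprobe is available; override FFPROBE to point elsewhere.
FFPROBE=${FFPROBE:-ffprobe}

# has_audio FILE (hypothetical helper): succeeds when FILE has at least
# one audio stream, fails otherwise.
has_audio() {
  # Prints one "audio" line per audio stream, nothing if there are none.
  streams=$("$FFPROBE" -v error -select_streams a \
      -show_entries stream=codec_type -of csv=p=0 "$1")
  [ -n "$streams" ]
}
```

Usage would look something like `has_audio CapeCodMarsh.avi && echo "encode vorbis too"`.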

Detailed example:

ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - | ffmpeg -an -f rawvideo -s 400x300 -r 20.00 -i - -f yuv4mpegpipe - | libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o tmp.ogv
ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg
oggz-comment audio.ogg -o audio2.ogg TITLE="Cape Cod Marsh" ARTIST="Tracey Jaquith" LICENSE="http://creativecommons.org/licenses/publicdomain/" DATE="2004" ORGANIZATION="Dumb Bunny Productions" LOCATION=http://www.archive.org/details/CapeCodMarsh
oggzmerge tmp.ogv audio2.ogg -o CapeCodMarsh.ogv
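The four commands above can be wrapped into one reusable function. This is only a sketch, not the Archive's actual deriver: `derive_ogv` and the overridable tool variables are hypothetical names, the source is assumed to have audio, and only a TITLE comment is written. The flags themselves are the ones from the example.

```shell
#!/bin/sh
# Tool names are variables so the sketch can be dry-run with stubs
# (for example FFMPEG=echo); the defaults match the example above.
FFMPEG=${FFMPEG:-ffmpeg}
THEORA_ENC=${THEORA_ENC:-libtheora-1.0/lt-encoder_example}
OGGZ_COMMENT=${OGGZ_COMMENT:-oggz-comment}
OGGZMERGE=${OGGZMERGE:-oggzmerge}

# derive_ogv SRC SIZE RATE TITLE (hypothetical helper): runs the four steps
# and writes <SRC without extension>.ogv in the current directory.
derive_ogv() {
  src=$1 size=$2 rate=$3 title=$4
  base=$(basename "$src")
  out=${base%.*}.ogv

  # Step 1: source -> rawvideo (yuv420p) -> yuv4mpegpipe -> theora encoder.
  "$FFMPEG" -an -deinterlace -s "$size" -r "$rate" -i "$src" \
      -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - |
  "$FFMPEG" -an -f rawvideo -s "$size" -r "$rate" -i - -f yuv4mpegpipe - |
  "$THEORA_ENC" --video-rate-target 512k - -o tmp.ogv || return 1

  # Step 2: vorbis audio, encoded separately from the video.
  "$FFMPEG" -y -i "$src" -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 \
      audio.ogg || return 1

  # Step 3: metadata goes on the audio file, not the video file.
  "$OGGZ_COMMENT" audio.ogg -o audio2.ogg "TITLE=$title" || return 1

  # Step 4: merge video and audio into the final .ogv.
  "$OGGZMERGE" tmp.ogv audio2.ogg -o "$out"
}
```

A call such as `derive_ogv CapeCodMarsh.avi 400x300 20.00 "Cape Cod Marsh"` would then stand in for the four lines above. Stopping at the first failing step avoids merging a stale tmp.ogv left over from an earlier run.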

WTFs:

Why the double pipe above? Some videos could not be converted directly to yuv4mpegpipe format in a way that libtheora (or ffmpeg2theora) would reliably accept.
We do the vorbis audio outside of libtheora (or ffmpeg2theora) to avoid any issues with Audio/Video sync.
We convert to yuv420p in the rawvideo step because ffmpeg2theora has (i think) some known issues handling certain yuv422 video inputs (i found at least a few videos that triggered them).
We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)
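To see which sources are affected by the yuv422 point above, one could inspect the pixel format before encoding. A rough sketch, assuming `ffprobe` from a current ffmpeg install; `source_pix_fmt` and `needs_420_conversion` are hypothetical helper names, and the pipeline in this post converts to yuv420p unconditionally rather than checking first:

```shell
#!/bin/sh
# Assumption: ffprobe is available; override FFPROBE to point elsewhere.
FFPROBE=${FFPROBE:-ffprobe}

# source_pix_fmt FILE: print the pixel format of the first video stream.
source_pix_fmt() {
  "$FFPROBE" -v error -select_streams v:0 \
      -show_entries stream=pix_fmt -of csv=p=0 "$1"
}

# needs_420_conversion FILE: succeeds when FILE is not already yuv420p.
needs_420_conversion() {
  [ "$(source_pix_fmt "$1")" != "yuv420p" ]
}
```

Something like `needs_420_conversion CapeCodMarsh.avi && echo "not yuv420p"` is enough to flag the inputs worth a closer look during QA.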

So this will end up working in Firefox 3.1 and greater, using the new HTML “video” tag:

<video controls="true" autoplay="true" src="http://www.archive.org/download/commute/commute.ogv"> for firefox betans </video>

This technique worked nicely across the wide range of 46 source and “trashy” videos that I use for QA before taking live a new way to derive our videos at archive.org.

-tracey jaquith

18 thoughts on “Fast and reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz”

j December 9, 2008 at 8:09 am

you should not use the ffmpeg vorbis encoder, it is really bad quality,
please use libvorbis. you can do this by changing your line to:
ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg

Maik Merten December 9, 2008 at 10:22 am

Actually with a libvorbis encoder one could encode the audio track with like 80 kbit/s and give the spare bitrate to the video stream.

I’d go for encoding Vorbis with the oggenc tool, not ffmpeg (which may have to be built with special options to allow encoding with libvorbis) and first decode to .wav (or pipe raw PCM samples)

oggenc --resample 44100 -b 80 audiodump.wav audio.ogg

Pete D. December 15, 2008 at 3:53 pm

It’s really great to see Ogg Theora videos being widely used in archive.org. Thanks for these efforts!

Doktor Bro December 21, 2008 at 10:20 am

Can’t wait to see Ogg Theora replace the flash videos. Thank you!

pookito December 16, 2009 at 6:32 pm

dude, you and I both, you and I both

Edwin January 28, 2009 at 8:59 pm

Thanks for your info.
But, is the output video worse in quality?

somebody February 3, 2009 at 5:03 am

This doesn’t work for me, it just throws an empty tmp.ogv .. it seems that the pipes aren’t working because if I ffmpeg -an -deinterlace -s 400×300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo OUTPUT .. it works and gives a huge file. but I don’t have the space to do each one separately. that’s on ubuntu hardy. any ideas on how to make the pipes work?

Cat Kutay November 22, 2011 at 6:26 am

Try replacing ‘–’ (en dash) with ‘-’ (plain hyphen),
and put two hyphens in front of video-rate-target etc.

Gregory Maxwell February 7, 2009 at 12:06 pm

Please don’t use the Vorbis encoder included in FFMPEG. It produces very low quality, even at high bitrates.

It is hard for me to express in words how much worse the ffmpeg Vorbis streams sound. So, I’ve put up some 11 second examples: With my test file when you ask ffmpeg for 128kbit/sec you get this 64kbit/sec result. At a comparable output bitrate Xiph.Org libVorbis gives this result and even at 45kbit/sec libVorbis simply sounds much better. … and Xiph.Org libVorbis isn’t even currently the best encoder available for these bitrates.

After listening I’m sure you can see that this is not just a nit-picking difference. The ffmpeg output simply sounds *bad*. It’s not something which should be associated with the Vorbis name, and it’s not the quality that the public already expects from Vorbis. (If you have trouble playing the FFmpeg-produced Ogg, this may be because the file is also not spec compliant, though it played on everything I had available to me.)

I’m unsure why ffmpeg is not shipping one of the liberally (BSD) licensed encoders. The Xiph.Org reference encoder in libvorbis would be an acceptable and obvious choice. (Although AoTuV would likely be a better choice, the difference is small compared to the output of the FFMPEG encoder). I don’t think most people in the Vorbis world were even aware of the FFMPEG encoder until it was noticed how poor the archive.org files sounded, as most people producing Theora files are using ffmpeg2theora which makes use of libVorbis to encode Vorbis audio.

The above processing could probably be amended to have the ffmpeg audio step output PCM, and pipe that into oggenc. Alternatively it may be possible to get ffmpeg to use libvorbis, as ffmpeg2theora does.

pookito December 16, 2009 at 6:43 pm

Hum, that is something to think about. I thought ffmpeg was good. I just finished posting a blog about it. I never knew that the ffmpeg theora/vorbis output file was lower quality than the ffmpeg2theora one. I am going to check that out. What I have noticed is that ffmpeg2theora takes a long time, I mean a really long time, to do the same job that ffmpeg does.

Pingback: ProCasts Blog » Converting screencasts to Ogg Theora (.ogv)

internetarchive April 15, 2009 at 1:49 pm

yes, ffmpeg on newer ubuntu linux distros can be set up to use libvorbis, so we are doing that now. we weren’t using “-acodec vorbis” previously just to showcase poor quality audio, but simply because in our 18-month old OS, that alternative wasn’t there.

we updated to newer ffmpeg and “libvorbis” just before tax day 2009.

Gregory Maxwell April 15, 2009 at 3:01 pm

Not sure why it would be an issue of age: Libvorbis support in ffmpeg is much older than their built in encoder.

In any case— fantastic news!

Older Women July 7, 2010 at 2:59 pm

These look like some great tools. I’ll have to check them out.

iraqchooseslife October 1, 2010 at 8:32 pm

Thank you
Accept traffic
Regards

George Chriss February 22, 2011 at 10:56 pm

Just ran into the same problem:
“We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)”

This is true with oggz-comment, and it still passes validation. Yikes!

traceypooh November 22, 2011 at 10:44 pm

Internet Archive Blogs

A blog from the team at archive.org

Fast and reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz
