archive.org has started making Theora derivatives for movie files: for each movie file, we create an Ogg Theora video output. After trying a bunch of tools over a good corpus of wide-ranging videos, I found a neat way to make the Archive derivatives.
High Level:
- use ffmpeg to turn any video to “rawvideo”.
- pipe its output to *another* ffmpeg to turn the video to “yuv4mpegpipe”.
- pipe its output to the libtheora tool.
- for videos with audio, use ffmpeg to create a vorbis audio .ogg file.
- add tasty metadata (with liboggz utils).
- combine the video and audio ogg files to an .ogv output!
Detailed example:
- ffmpeg -an -deinterlace -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - | ffmpeg -an -f rawvideo -s 400x300 -r 20.00 -i - -f yuv4mpegpipe - | libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o tmp.ogv
- ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg
- oggz-comment audio.ogg -o audio2.ogg TITLE="Cape Cod Marsh" ARTIST="Tracey Jaquith" LICENSE="http://creativecommons.org/licenses/publicdomain/" DATE="2004" ORGANIZATION="Dumb Bunny Productions" LOCATION=http://www.archive.org/details/CapeCodMarsh
- oggzmerge tmp.ogv audio2.ogg -o CapeCodMarsh.ogv
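The four steps above could be wrapped up for any source file roughly like this. This is only a sketch, not the script the Archive actually runs: the function names (theora_video_cmd, vorbis_audio_cmd, comment_cmd, merge_cmd, derive_cmds) are made up for illustration, it only sets a TITLE comment, and it assumes file names without spaces or shell metacharacters. It prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Sketch: print the derive commands for one source video.
# All function names are hypothetical; the printed commands mirror the example above.
# Assumes file names (and the title) contain no quotes or shell metacharacters.

# theora_video_cmd SRC SIZE RATE OUT
# Double pipe: source -> rawvideo (yuv420p) -> yuv4mpegpipe -> libtheora encoder.
theora_video_cmd() {
    printf '%s | %s | %s\n' \
        "ffmpeg -an -deinterlace -s $2 -r $3 -i $1 -vcodec rawvideo -pix_fmt yuv420p -f rawvideo -" \
        "ffmpeg -an -f rawvideo -s $2 -r $3 -i - -f yuv4mpegpipe -" \
        "libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o $4"
}

# vorbis_audio_cmd SRC OUT
# Audio is encoded separately from the video.
vorbis_audio_cmd() {
    printf 'ffmpeg -y -i %s -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 %s\n' "$1" "$2"
}

# comment_cmd IN OUT TITLE
# Metadata goes on the audio file, not the video file.
comment_cmd() {
    printf 'oggz-comment %s -o %s TITLE="%s"\n' "$1" "$2" "$3"
}

# merge_cmd VIDEO AUDIO OUT
merge_cmd() {
    printf 'oggzmerge %s %s -o %s\n' "$1" "$2" "$3"
}

# derive_cmds SRC SIZE RATE TITLE
# Prints all four commands; the output name is SRC with its extension swapped for .ogv.
derive_cmds() {
    theora_video_cmd "$1" "$2" "$3" tmp.ogv
    vorbis_audio_cmd "$1" audio.ogg
    comment_cmd audio.ogg audio2.ogg "$4"
    merge_cmd tmp.ogv audio2.ogg "${1%.*}.ogv"
}
```

Calling derive_cmds CapeCodMarsh.avi 400x300 20.00 "Cape Cod Marsh" prints the four commands for review. Piping that output to sh would run them, assuming ffmpeg, the libtheora encoder example, and the liboggz tools are all available at those paths.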
WTFs:
- Why the double pipe above? For some videos, going directly to yuv4mpegpipe format produced output that libtheora (or ffmpeg2theora) could not reliably encode.
- We do the vorbis audio outside of libtheora (or ffmpeg2theora) to avoid any issues with Audio/Video sync.
- We convert to yuv420p in the rawvideo step because ffmpeg2theora has (I think) some known issues of not handling all yuv422 video inputs (I found at least a few videos that failed this way).
- We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)
So this will end up working in Firefox 3.1 and greater, using the new HTML "video" tag:
<video controls="true" autoplay="true" src="http://www.archive.org/download/commute/commute.ogv"> for firefox betans </video>
This technique worked nicely across the 46 wide-ranging (and often "trashy") source videos that I use for QA before making live any new way to derive our videos at archive.org.
-tracey jaquith