(Original, February 2005)
(Updated, August 2006, new script keeps audio/video sync and creates VOBSUB subtitle files)
(Updated, August 2007, patch contributed from Benjamin Pineau for latest versions of mplayer and mkvmerge)

1. Introduction

Backing up DVDs has long been a matter of controversy, on both legal and technical grounds. I have been following the technical side ever since I was a student. Some of my free time still goes into keeping up with innovations in the area, looking for the best possible way of transcoding MPEG2 (a.k.a. DVD), in terms of balancing output quality against transcoding speed.

For the last couple of years, I had settled on using:

...through excellent open source tools (transcode, oggenc, mkvmerge). I even contributed some code to the transcode project for handling NTSC telecined DVDs.
The reasons for these choices were simple:
Recently, however, one of the parameters changed: MPEG4 has now been obsoleted in terms of quality/bitrate by a new codec: H264.
To give you an idea of what can be accomplished with open source implementations, one can now encode 2.5 hours of acceptable audiovisual quality on just one 700MB CD!

I found some spare time, so I'll show you how to do it -- the relevant information up to now is kinda sketchy (to put it mildly :‑) However, I will only describe the crux of it, and no, please don't start asking me stuff. This is only for people who have a working knowledge of both UNIX and video matters, so if you don't know what an aspect ratio is, or why 0.2 bits per pixel is the lowest limit of acceptable MPEG4 quality, stop reading this now (while you're ahead :‑).
Sorry, but making a living writing software takes almost all of my time.

P.S. You'll also get a Perl script to assist the work.

2. Prerequisites

I'm only doing this as a hobby, so I won't research into any other versions of the tools: use the same version that I am, or adapt the Perl script yourself.

3. Encoding

Even though I have been using 'transcode' for most of my hobby tests, it doesn't support H264 (not yet). Hence, we are forgetting about distributed encodes (for now :‑) and are going to do it through the significant other, mencoder.

We'll tackle a rather advanced example, transcoding an NTSC telecined Dolby trailer, called dolby-city.vob. You can google the filename; it will immediately show up. It's rather big for plain modems though (27MB), so you can follow these instructions with your preferred VOB instead.

Naturally, if you want to encode DVD data, you'll have to store the unencrypted MPEG2 data somewhere; I'm assuming you know how to circumvent the CSS encryption to get to the MPEG2 data you want. mplayer -dumpstream dvd://1 can lead to legal troubles in certain countries, so make sure you are not breaking any laws doing it...

Most of what follows is automated through a Perl script I made, but since you are reading this I'm guessing you want to know the details :‑)
To skip the explanations and just use the script, jump to section 5.

Let's assume that your MPEG2 data are stored in a directory called video:

heaven:/var/tmp/video$ ls -l
-rw-r--r-- 2 ttsiod users 27963392 2005-02-07 19:44 dolby-city.vob

3.1. Audio

We'll tackle the audio first: mplayer will decode the audio and convert it to plain stereo, feeding the output to a pipe:
heaven:/var/tmp/video$ mkdir tmp
heaven:/var/tmp/video$ mkfifo tmp/fifo
heaven:/var/tmp/video$ tcscan -x pcm -i tmp/fifo &
heaven:/var/tmp/video$ mplayer \
    -ao pcm:nowaveheader:file=tmp/fifo \
    -vo null -vc dummy -benchmark dolby-city.vob \
    >/dev/null 2>&1 
This will output something like:
.. audio frames=849.60, estimated clip length=33.98 seconds
.. (min/max) amplitude=(-0.252/0.274), suggested rescale=3.655
tcscan provides the normalization factor we need for our audio encoding process: usually, DVD audio has quite a dynamic range, so it needs some boosting for our transcoding. Now that tcscan has given us what we need, we'll encode the audio, normalizing it in the process.
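If you would rather capture that factor in a script than read it off the screen, a small filter over the output will do. This is only a minimal sketch, assuming the output keeps the "suggested rescale=" wording shown above; extract_rescale is a name I made up for illustration, it is not part of tcscan.

```shell
#!/bin/sh
# Hypothetical helper: read tcscan-style output on stdin and print the
# number that follows "suggested rescale=" (wording assumed from the
# sample output in this article).
extract_rescale() {
    sed -n 's/.*suggested rescale=\([0-9.][0-9.]*\).*/\1/p' | tail -n 1
}

# Example, fed with the sample line from this article:
printf '%s\n' \
    '.. (min/max) amplitude=(-0.252/0.274), suggested rescale=3.655' \
    | extract_rescale
```

With the sample line this prints 3.655, ready to be converted to dB.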

Since a volume rescale of 3.655 is 20*log10(3.655) dB, we...

heaven:/var/tmp/video$ echo '20*l(3.655)/l(10)' | bc -l
heaven:/var/tmp/video$ oggenc -b 64 -o audio.ogg tmp/fifo &
heaven:/var/tmp/video$ mplayer -af volume=11.257747 -ao \
    pcm:file=tmp/fifo -vo null \
    -vc dummy -benchmark dolby-city.vob >/dev/null 2>&1 
(you can also try "-q 0" instead of "-b 64" - Vorbis audio is quite good even at 64 kbps, and we'll use just that in this test).
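If bc is not around, the same dB conversion can be done with awk. A minimal sketch; rescale_to_db is a hypothetical helper name, and the only thing it assumes is the 20*log10(factor) relation mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: convert a linear amplitude factor to decibels,
# i.e. 20*log10(factor). awk's log() is the natural logarithm, hence
# the division by log(10). LC_ALL=C keeps the decimal point a dot.
rescale_to_db() {
    LC_ALL=C awk -v r="$1" 'BEGIN { printf "%.4f\n", 20 * log(r) / log(10) }'
}

rescale_to_db 3.655
```

For 3.655 this should agree, to four decimals, with the 11.257747 passed to -af volume= above.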

The audio part is now done (in a more or less optimal way, in terms of quality/bitrate).
Now the hard part: video.

3.2. Video

Video can be a pain in the neck. Before we even get to the encoding part, we need to clean it up.

You'll have to deal with this by setting up a filter chain in mplayer.

3.2.1. Cropping

Use cropdetect:
heaven:/var/tmp/video$ mplayer -vf cropdetect -nosound \
	dolby-city.vob
Navigate through the movie with the cursor/PgUp/PgDn keys to make sure you've fed the filter all it needs to see from your movie. When done, quit mplayer and check the output for the last of the "crop area" lines:
crop area: X: 4..715  Y: 0..479  (-vf crop=704:464:8:8)
Crop settings are now known: -vf crop=704:464:8:8. Test them:
heaven:/var/tmp/video$ mplayer -vf crop=704:464:8:8 \
	-nosound dolby-city.vob
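Picking the last "crop area" line by eye gets old quickly. Here is a minimal sketch of doing it with sed, assuming the line format shown above; last_crop is a hypothetical helper, and the first sample line below is invented purely to show that the last match wins.

```shell
#!/bin/sh
# Hypothetical helper: read mplayer output on stdin and print the crop
# filter found in the last "crop area" line (line format assumed from
# the sample in this article).
last_crop() {
    sed -n 's/.*(-vf \(crop=[0-9:]*\)).*/\1/p' | tail -n 1
}

# Example: the first line is made up, the second is the one from above.
printf '%s\n' \
    'crop area: X: 0..719  Y: 0..479  (-vf crop=720:480:0:0)' \
    'crop area: X: 4..715  Y: 0..479  (-vf crop=704:464:8:8)' \
    | last_crop
```

In real use you would pipe mplayer's output into last_crop instead of the printf.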

3.2.2. Interlacing

Interlacing is a different beast. Fire up mplayer again, navigate to a part of the movie with lots of action, and hit the DOT key (.). This will pause the movie, and each time you hit it again, it will step exactly one frame. If the frames you see this way are clean, you don't need a deinterlacer; if they appear to be "combed", you do. I could rant for hours about how to deinterlace, but basically, you'll either be content with something as simple as
heaven:/var/tmp/video$ mplayer -vf pp=lb ...
or, if it's an NTSC telecined one, with
heaven:/var/tmp/video$ mplayer -vf detc -ofps 23.976 ...
The latter one is the one you need for the dolby-city trailer. It is a telecined beast, so it needs one heck of a filter to get back to progressive.

3.2.3. Scaling

Finally, we'll probably have to scale the video. How do we decide whether we want to, or not?

Mencoder's documentation advises against this; the authors feel that frame scaling is too much tampering with the original video, and that this is bad. They are over-reacting; we are doing this to squeeze more data into less storage, and doing so at the extreme level we want simply REQUIRES scaling (unless we are targeting video bitrates above 1 Mbit/sec, but then, why bother with H264 and not stick to MPEG4?)

To cut a very long story short, it was pointless to encode with MPEG4 - i.e. XVID, DIVX, or ffmpeg - to bitrates less than 0.2 bits per pixel. With H264, this changes: we can go lower, e.g. 0.125 bits per pixel and still get acceptable quality.

In dolby-city.vob, the original frame size is 720x480 pixels, at a (progressive, after deinterlacing) frame rate of 23.976 fps. This means that we would need at least

        720 x 480 x 23.976 x 0.125 = 1035763.2 bits per second
... if we were to avoid scaling. This bitrate is too high: we couldn't even fit a 2 hour movie on a 700MB CD at this rate.
Since our movie has a 4:3 aspect ratio, we can simply scale to a smaller window, like
        512 x 384 x 23.976 x 0.125 = 589234.176 bits per second
Notice that at this bitrate, we would be able to fit 2.5 hours of movie time on one 700MB CD, since
        Duration in seconds = 2.5*3600 = 9000 seconds
        Expected video size = 9000*589234.176/8 = 662888448 bytes = 632 MBytes
        Expected audio size = 9000*64000/8  = 72000000 bytes = 68 MBytes
...which would fit nicely in our 700MB target size (thanks to Matroska's near zero cost multiplexing). A MByte here is 2^20 bytes.
Notice also that we chose multiples of 16 for our frame sizes: this is not a whim, it's a requirement of almost all codecs.
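The arithmetic above is easy to redo for other frame sizes and durations. A minimal sketch with two hypothetical helpers (video_bitrate and stream_size_mib are my names, not part of any tool); sizes are reported in units of 2^20 bytes, which is how the MByte figures above come out.

```shell
#!/bin/sh
# Hypothetical helpers reproducing the bitrate arithmetic above.

# video_bitrate WIDTH HEIGHT FPS BITS_PER_PIXEL -> bits per second
video_bitrate() {
    LC_ALL=C awk -v w="$1" -v h="$2" -v fps="$3" -v bpp="$4" \
        'BEGIN { printf "%.0f\n", w * h * fps * bpp }'
}

# stream_size_mib BITS_PER_SECOND SECONDS -> size in MiB (2^20 bytes)
stream_size_mib() {
    LC_ALL=C awk -v bps="$1" -v secs="$2" \
        'BEGIN { printf "%.1f\n", bps * secs / 8 / 1048576 }'
}

video_bitrate 720 480 23.976 0.125    # unscaled frame
video_bitrate 512 384 23.976 0.125    # scaled frame
stream_size_mib 589234 9000           # 2.5 hours of video
stream_size_mib 64000 9000            # 2.5 hours of 64 kbit/s audio
```

Trying a few widths and heights (multiples of 16) this way is a quick check of whether a movie of a given length has any chance of fitting before you spend hours encoding it.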

3.2.4. Using all filters

So, to actually do this, we'll use three mencoder filters:

-vf crop=704:464:8:8,detc,scale=512:384

The reason we are first cropping, then deinterlacing and finally scaling, should be obvious.
We can now complete the sequence with the H264 encoding parameters.

I won't bother explaining why you should always use two passes, just read the relevant info (or trust me):

Pass 1:

mencoder -nosound -ofps 23.976 \
	-vf crop=704:464:8:8,detc,scale=512:384 \
	dolby-city.vob \
	-o /dev/null -ovc x264 -x264encopts pass=1:bitrate=589
Pass 2:
mencoder -nosound -ofps 23.976 \
	-vf crop=704:464:8:8,detc,scale=512:384 \
	dolby-city.vob \
	-o video.avi -ovc x264 -x264encopts pass=2:bitrate=589
Check video.avi; it has no sound, but video quality at this bitrate is simply beyond any comparison with MPEG4.
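Both passes must use exactly the same filters, frame rate and bitrate, and retyping them is where mistakes creep in. One way to avoid that, sketched under the assumption that you are happy with the values used above: keep the shared settings in variables and generate both command lines from them. two_pass_commands is a hypothetical helper; it only prints the commands, so you can inspect them before piping them to sh.

```shell
#!/bin/sh
# Sketch: settings shared by both passes live in one place.
# The values are the ones used in this article.
VF='crop=704:464:8:8,detc,scale=512:384'
FPS=23.976
BITRATE=589
INPUT=dolby-city.vob

# Hypothetical helper: print (not run) the two mencoder command lines.
two_pass_commands() {
    for pass in 1 2; do
        # Pass 1 only gathers statistics, so its video output is discarded.
        if [ "$pass" -eq 1 ]; then out=/dev/null; else out=video.avi; fi
        echo "mencoder -nosound -ofps $FPS -vf $VF $INPUT" \
             "-o $out -ovc x264 -x264encopts pass=$pass:bitrate=$BITRATE"
    done
}

two_pass_commands
```

Once the printed commands look right, "two_pass_commands | sh" runs them in order.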

4. Multiplexing

Finally, we'll have to create the Matroska container for video and audio:
mkvmerge --engage allow_avc_in_vfw_mode -o Perfect.mkv \
    video.avi audio.ogg
You might need to synchronize video and audio; check the manpage for mkvmerge and learn how to use the -y switch. Or better yet, use the script: it utilizes mencoder in a way that guarantees video and audio will be in sync.

5. Using the script

You might consider the previous steps tiresome.
You are right, they are; and if you make one mistake along the way, you could end up burning CPU time on unacceptable results.

That's why I coded a very simple Perl script that glues together all that you've seen.
Download it here and use it like this: (Update, August 2007: Thanks to Benjamin Pineau, if you are using the latest versions of mplayer and mkvmerge, you can download a patch here to support them)

heaven:/var/tmp/video$ vobs2mkv.pl dolby-city.vob 3 Perfect.mkv
...which requests an encoding of the dolby-city.vob movie, and a generation of file Perfect.mkv with a size around 3 MBytes.
Follow the prompts it shows; they should be self explanatory.
If they are not, hey, check the code! I only use simple aspects of Perl, so you should be able to figure out what goes on (it works for me with all the files I tried).
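To get a feeling for how a target size turns into a video bitrate, here is a rough sketch. It is my own back-of-the-envelope arithmetic, not code lifted from vobs2mkv.pl: take the target size in bits, divide by the duration, and subtract the audio bitrate, ignoring container overhead. The 35 seconds and the 96 kbit/s audio are the values reported in the session output further down; video_bitrate_for_size is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper (NOT taken from vobs2mkv.pl):
# video_bitrate_for_size TARGET_MIB SECONDS AUDIO_BPS -> video bits/sec,
# ignoring container overhead. A MiB is 2^20 bytes.
video_bitrate_for_size() {
    LC_ALL=C awk -v mib="$1" -v secs="$2" -v audio="$3" \
        'BEGIN { printf "%.0f\n", mib * 1048576 * 8 / secs - audio }'
}

# 3 MBytes target, 35 second clip, 96 kbit/s audio
video_bitrate_for_size 3 35 96000
```

This gives roughly 623 kbit/s, in the same ballpark as the "Expected Video bitrate: 620000 bits per sec" that the script reports for this clip.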

It also attempts to rip any MPEG2 subtitles existing in the stream to VOBSUB files, thus allowing a "perfect" rip; optimal video/audio/subtitle encoding.

This is the output (including the answers I gave) for 'dolby-city.vob':

Successfully located mplayer
Successfully located mencoder
Successfully located tcscan
Successfully located mpegdemux
Successfully located oggenc
Successfully located ogginfo
Successfully located mkvmerge
Available audio channels:
  1. 128
Automatically choosing audio channel 128
What codec should I utilize:
 1. XVID
 2. X264
Choose: 2
Identified Video stream successfully
Identified AID 0x80 successfully

Will now spawn mplayer to detect subtitle streams...
Navigate with PgUp/PgDown to movie parts with subtitles...
Hit ENTER when ready... and ESC to quit movie playback...

No subtitles detected.
Detected movie length of 35 seconds.
Will now spawn mplayer to detect crop settings...
Use the DOT key (.) to check for interlacing also...
Hit ENTER when ready...
~/.mplayer/subfont.ttf doesn't look like a font description, 
Cannot load font: /root/.mplayer/subfont.ttf
The selected video_out device is incompatible with this codec.
Try adding the scale filter, e.g. -vf spp,scale instead 
of -vf spp.
Do you want the codec to encode as interlaced (Y/N) ? n
Do you need NTSC inverse telecine (Y/N) ? y
Expected Video bitrate: 620000 bits per sec
MPEG2 Aspect Ratio:  1.33
FPS: 23.976
1. 208 x 160 (0.777019006185673)
2. 224 x 176 (0.655925135091802)
15. 496 x 384 (0.135769450005561)
16. 528 x 400 (0.12243935855047)
Choose a number (1 - 18) : 16
Will now spawn mplayer so that you can check your frame settings
Press ENTER when ready...
Did you like your frame settings (Y/N) ? y
Scaning to find amplification factor... Please wait...
Audio will be scaled by 3.655 (11.2577 db).
Encoding to Ogg Vorbis 96KBits/sec...
After a couple of minutes, you'll get your Perfect.mkv.
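The number in parentheses next to each frame size in the output above appears to be the bits per pixel you would get at the expected bitrate, i.e. bitrate / (width x height x fps). A minimal sketch to recompute it; bits_per_pixel is a hypothetical helper, and the formula is inferred from the numbers shown, not read from the script.

```shell
#!/bin/sh
# Hypothetical helper: bits_per_pixel BITRATE WIDTH HEIGHT FPS
# Formula inferred from the session output, not taken from the script.
bits_per_pixel() {
    LC_ALL=C awk -v b="$1" -v w="$2" -v h="$3" -v fps="$4" \
        'BEGIN { printf "%.4f\n", b / (w * h * fps) }'
}

bits_per_pixel 620000 528 400 23.976    # choice 16 in the session above
bits_per_pixel 620000 496 384 23.976    # choice 15
```

Rounded to four decimals, both results match the values the script printed, which is why I picked the largest frame that still stays close to 0.125 bits per pixel.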

You can have a look at what happened from tmp/log.txt:

heaven:/var/tmp/video$ cat tmp/log.txt
03:07:13 : mplayer -v -frames 0 dolby-city.vob 2>&1 |
03:07:15 : Encoding session starts
03:07:15 : dd if=dolby-city.vob bs=1M count=2 2>/dev/null | \
	mpegdemux -c|
03:07:16 : mplayer -sid 0 -v -quiet dolby-city.vob 2>/dev/null |
03:07:16 : mencoder -ovc copy   -nosound -o /dev/null \
	-frameno-file /dev/null \
     	   dolby-city.vob 2>/dev/null |
03:07:17 : mplayer -nosound -benchmark -vf cropdetect \
	-quiet dolby-city.vob |
03:07:26 : mplayer -nosound -frames 10 dolby-city.vob 2>&1 |
03:07:47 : mplayer -nosound -really-quiet  \
	-vf crop=704:480:8:0,detc,scale=528:400 \
     	   dolby-city.vob >/dev/null 2>&1
03:07:48 : mv tmp/videoCropDeinterAndScale \
03:07:48 : rm -f tmp/fifo
03:07:48 : mkfifo tmp/fifo
03:07:48 : mplayer -really-quiet -aid 128 \
	-ao pcm:file=tmp/fifo:nowaveheader -vo null \
	-vc dummy dolby-city.vob >/dev/null 2>&1 &
03:07:48 : tcscan -i tmp/fifo -x pcm 2>/dev/null |
03:07:49 : mplayer -really-quiet -aid 128 -af volume=11.2577 \
	-ao pcm:file=tmp/fifo -vo null -vc dummy dolby-city.vob \
	>/dev/null 2>&1 &
03:07:49 : oggenc -s 123 -Q -b 96 -o tmp/audio.ogg tmp/fifo
03:07:52 : mencoder -ovc frameno -oac pcm -aid 128 -o \
	frameno.avi dolby-city.vob
03:07:53 : ogginfo tmp/audio.ogg|
03:07:53 : mencoder -ofps 23.976 \
	-vf crop=704:480:8:0,detc,scale=528:400 \
	-ovc x264 -oac copy \
	-x264encopts bitrate=654:pass=1:subq=6:\
	b_pyramid:weight_b \
	-o /dev/null dolby-city.vob

6. Comparison with MPEG4

Since the script allows you to select the codec used, try encoding your own videos around 0.125 bits per pixel with both MPEG4 and H.264. You won't be needing any screenshots for proof - your eyes will tell you that there's only one obstacle for widespread H.264 adoption: encoding speed. Currently, it is 2-5 times slower than MPEG4 encoding (depending on video type and CPU used). Let's hope x264 coders will improve the codec's speed over time.

Updated: Mon Jan 9 22:01:46 2017