DCDi

An overview of DCDi®
By
Stacey Spears and Brian Florian
Finally! Faroudja technology
is now available at "affordable"
prices for the masses in consumer electronics. How do you identify
products with this technology? Simply look for the DCDi® logo. In some rare cases, Faroudja technology might be inside even when there is no logo to be found.
There are several products on the market today, from DVD players to display
devices, that contain this Emmy® award winning technology. What was once only
available in products costing $15,000 or more is now in multiple products well
below $1,000.
On the
design side, DCDi is a specific
technology Faroudja introduced a couple of years back in their broadcast
up-converter. Today the DCDi logo is really used to identify a superset of
Faroudja technology. This includes their patented film mode detection, bad-edit
detection, cross-color suppressor, and DCDi.
What is DCDi and why do you or I care? Before
we can explain that, some background information must be provided. Brian Florian will
show you how film is transferred to video, which is really the heart of it all.
An Explanation of Film-to-Video
Frame Rate Conversion for NTSC
To better understand the upcoming concepts, one
must be armed with some basic knowledge of how film gets transferred to
video, as well as the nature of interlaced versus progressive display. As
such, the following information is not intended to be a definitive paper on
the subject, but should serve as a good introduction for all.
The visuals and animations presented here,
though large in file size, are key and will reward repeat
viewing.
Motion pictures are comprised not of motion at
all, but numerous stills shown in rapid succession. For the films we all
watch at the theater, 24 frames are shown in one second (24 frames per
second, or 24fps). The NTSC television system differs from film in this
regard, making it complicated to show film on video.
Televisions create their image by
drawing (scanning) lines of light on the CRT face, left to right, top to
bottom, to produce a picture over the entire screen. The resultant images
that make up the motion picture are comprised of two interlaced fields: that
is, the first field consists of all the odd lines (1 through 525), and the
second field consists of all the even lines (2 through 524). The result is
that only half of the video's display is drawn every 60th of a second. A
simulation of this is shown on the left. Field 1 is scanned, and then Field
2 is scanned. Traditional talk quotes NTSC television as having 30 frames
per second (as opposed to film's 24), each being comprised of two interlaced
fields. This is actually misleading: The NTSC interlaced system shows 60
unique images per second, but each one uses only half of the vertical
resolution available on the display. Only if the source material contained
30 unique frames per second could you say that two fields form a
single frame but in reality, video material such as the evening news is true
60 fields per second. So we don't want to think of interlaced televisions in
terms of frames but rather in terms of fields, interlaced fields, and 60 of
them per second.
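To make the odd/even split concrete, here is a minimal sketch in Python. The function name and the toy six-line "frame" are ours, purely for illustration. Keep in mind that with true video material the two fields are captured at different moments in time, so they are not simply two halves of one still picture as they are here.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into its two interlaced fields.

    Scan lines are numbered from 1, as in the text, so the odd field holds
    lines 1, 3, 5, ... and the even field holds lines 2, 4, 6, ...
    """
    odd_field = frame[0::2]   # list indices 0, 2, 4 are scan lines 1, 3, 5
    even_field = frame[1::2]  # list indices 1, 3, 5 are scan lines 2, 4, 6
    return odd_field, even_field


# A toy 6-line "frame"; each string stands in for one scan line.
frame = ["line 1", "line 2", "line 3", "line 4", "line 5", "line 6"]
odd_field, even_field = split_into_fields(frame)
print(odd_field)   # ['line 1', 'line 3', 'line 5']
print(even_field)  # ['line 2', 'line 4', 'line 6']
```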
The principal drawbacks of an interlaced display
are (A) visible line structure, (B) flicker caused by the rapid alternating
of the fields, and most important, (C) artifacts such as 'feathering'
(also referred to as 'combing') and 'line twitter'. Visual artifacts like
these last two occur anytime the subject or the camera is in a different
position from field to field. The subject will be in one position for
one field, and in another position for the next, resulting in jagged edges
(feathering) or shimmering horizontal lines (twitter).
The animation on the right shows an example of
an interlaced display trying to show a tomato moving from left to right.
Each field shows the tomato a little farther to the
right than the previous. Because the fields are interlaced, jagged vertical
edges can't help but exist, except during the last two fields (5 and 6)
where the tomato is stationary. The further back you are from an interlaced
display (or the smaller the display is), the less this and other artifacts
are noticed. If you want to see the effect in real life, just stick your
nose up to an interlaced TV. Focus in on an object's edge that is stationary
and wait for it to move. You will notice the problem right away.
At left is an interlaced image of a skier. Not
only is the flicker annoying, but have a good look at the ski-pole: It comes
and goes because it's so fine it can only be found in one of the two
interlaced fields. This is line twitter. This artifact manifests itself
when fine detail is less than 2 scan lines high. It is exacerbated during
vertical movement as the fields alternate. Often fine detail is filtered
before being encoded to minimize these artifacts when played back at home on
your interlaced display device. Because of this, we have yet to experience
the full potential of DVD.
The preceding basic knowledge of interlacing is
necessary to understand the transfer of film to video, because it is an
important factor in what we end up seeing.
Motion picture photography is based on 24 frames
per second. Time to call to mind all that math you learned in school and
realize that 24 doesn't go into 60 very easily. To boil it down a little,
our challenge is to make 4 frames from the film fit as evenly as possible
across 10 video fields. We can't just double up the fields on every fourth
film frame or we'd get a real 'stuttered' look. Instead, a process is used
known as 3-2 pulldown to create 10 video fields from 4 film frames. This
form of telecine alternates between creating 3 fields from a film frame and
2 fields from a film frame. Hence the name 3-2.
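The cadence is easy to sketch in a few lines of Python. The function name and the frame labels A through D are our own, and the sketch only tracks which film frame and which parity each field comes from, not the picture content.

```python
def pulldown_32(frames):
    """Map film frames to video fields using a 3-2 cadence.

    Returns a list of (frame, parity) pairs. Parity alternates between
    "odd" and "even" from one field to the next, as on an interlaced display.
    """
    fields = []
    for index, frame in enumerate(frames):
        copies = 3 if index % 2 == 0 else 2  # 3 fields, then 2, then 3, ...
        for _ in range(copies):
            parity = "odd" if len(fields) % 2 == 0 else "even"
            fields.append((frame, parity))
    return fields


fields = pulldown_32(["A", "B", "C", "D"])
print(len(fields))  # 10
print([frame + parity[0] for frame, parity in fields])
# ['Ao', 'Ae', 'Ao', 'Be', 'Bo', 'Ce', 'Co', 'Ce', 'Do', 'De']
```

Four frames become ten fields, so 24 frames per second become 24 x 10 / 4 = 60 fields per second.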
Consider now our flow chart of the 3-2 pulldown
performed on four frames of this movie scene:


Pretty cool right? It is and it isn't. 3-2
pulldown inherits many of the artifacts we described when talking about
interlaced video. Anytime a field follows one made from a different film
frame (noted above by the "!" icon), there exists the possibility for
anomalies in what we see, feathering and twittering being great examples.
Absolutely any differences between the two film frames that make up the
video frame (the last field of one frame and the first field of the next
frame), be it brightness, color, or especially motion, are going to result
in some artifact as the two fields merge on screen. Even our little animated
synthesis of the final interlaced product, which actually contains 10
interlaced pieces, shows evidence of such anomalies as the flying police
cars move ahead. Such is life.
As long as you are watching your movies on an
ordinary interlaced display, there is not much more to tell you. What you
see at home is pretty much what we've shown as the interlaced content in the
above illustration. But should you have the fortune to be using a
progressive display TV, the following comes into play.
Progressive displays, such as high-performance
CRT/LCD/DLP/D-iLA projectors and the new HDTV-ready TVs, can show
progressive scanned images as opposed to interlaced. In order to do this,
the display must scan at a higher rate, 2x the speed of NTSC. Because we are
scanning at twice the speed, we can draw an entire frame in the same amount
of time it takes an interlaced system to draw a single field. We learned
above that an interlaced display shows 60 fields per second. But with
progressive, each "field" is now a complete picture including all scan
lines, top to bottom, so we will now call it a frame, and we are showing 60
of those per second. (Of course, only 24 of those are unique if the source
is film based.) The benefits of a progressive display are no flicker, scan
lines are much less visible (permitting closer seating to the display), and
they have none of the artifacts we described for the interlaced display, as
long as the source material is progressive in nature (film or a progressive
video camera).
But sources which are truly progressive in
nature are hard to come by right now. Movies on DVD are almost always
decoded as interlaced fields yet all of the film's original frames
are there, just broken up. What we're going to talk about next is how we
take the interlaced content of DVD and recreate the full film frames so we
can display them progressively. The term commonly used to restore the
progressive image is deinterlacing, though we think it is more correct to
call it re-interleaving, which is a subset of deinterlacing.
Deinterlacing (or re-interleaving) involves
assembling pairs of interlaced fields into one progressive frame (1/60 of a
second long), and showing it at least twice to use up the same amount of
time as two fields. The need for 60 flashes on the screen each second stems
from a biological property called the Flicker Fusion Frequency, meaning how
many flashes that we need to see each second so that we (our brains) fuse
the image into one where we don't see a flicker.
For every film frame that had three fields made
from it, the third field is a duplicate of the first, and (if the MPEG-2
encoder is behaving properly) won't even be stored on the DVD. Instead of
encoding the duplicate fields, the DVD flags repeat_first_field and
top_field_first are used to instruct the MPEG decoder where to place these
duplicate fields during playback.
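As a rough sketch of how a decoder might act on those two flags, consider the Python below. The function name and the tuple layout are hypothetical, and the flag values given for frames A through D are the pattern usually described for 3-2 material, which we are assuming here rather than reading off any particular disc.

```python
def expand_fields(coded_frames):
    """Expand coded frames into the field sequence sent to the display.

    Each coded frame is (name, top_field_first, repeat_first_field).
    Only two fields per frame are stored; a third is regenerated on
    playback when repeat_first_field is set.
    """
    output = []
    for name, top_field_first, repeat_first_field in coded_frames:
        first, second = ("top", "bottom") if top_field_first else ("bottom", "top")
        output.append((name, first))
        output.append((name, second))
        if repeat_first_field:
            output.append((name, first))  # duplicate of the first field
    return output


# Assumed flag pattern for four film frames (hypothetical example data).
coded = [
    ("A", True, True),    # top, bottom, top
    ("B", False, False),  # bottom, top
    ("C", False, True),   # bottom, top, bottom
    ("D", True, False),   # top, bottom
]
fields = expand_fields(coded)
print(len(fields))  # 10
print([name for name, _ in fields])
# ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D', 'D']
```

Notice that the output parity still alternates top, bottom, top, bottom all the way through, which is what the interlaced display expects.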
The progressive output of a DVD player should
assemble 2 fields from each film frame and create a complete progressive one
that looks just like the original film frame. You should now be thinking
that the DVD will once again have 24 frames to show in one second. But the
progressive display is still expecting 60 complete frames per second. In
order to space them out, the DVD player shows the complete frames in this
order: 1, 1, 1, 2, 2, 3, 3, 3, 4, 4 and so on.
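The two steps just described, weaving a pair of fields back into a frame and then repeating frames to fill 60 slots per second, can be sketched in Python as follows. Both function names are ours, and a real player works on pixel data rather than labeled strings.

```python
def weave(odd_field, even_field):
    """Interleave two fields (lists of scan lines) back into one frame."""
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)
        frame.append(even_line)
    return frame


def progressive_order(frame_numbers):
    """Repeat frames 3, 2, 3, 2, ... times to fill a 60 frame/s output."""
    order = []
    for index, number in enumerate(frame_numbers):
        repeats = 3 if index % 2 == 0 else 2
        order.extend([number] * repeats)
    return order


print(weave(["line 1", "line 3"], ["line 2", "line 4"]))
# ['line 1', 'line 2', 'line 3', 'line 4']
print(progressive_order([1, 2, 3, 4]))
# [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]
```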


This form of display gives us a moving image
very close to the original film. It has a tendency to "judder" a bit though,
as every other film frame lasts 1/60 of a second longer than the previous
one. Even our little synthesis of the final product, which actually contains
10 pieces, shows this judder. In the future, both the player and the display
could increase their display rate above 60 frames per second, to 72 per
second. At that point, each frame would only last 1/72 of a second,
permitting the player to show every film frame three times (24 x 3 = 72),
eliminating the motion judder, and also helping us with the Flicker Fusion
Frequency problem (60 flashes per second are just barely enough in a well
lit viewing environment). This would look like: 1, 1, 1, 2, 2, 2, 3,
3, 3, 4, 4, 4 and so on. 72 fps will only work with film based sources
though, as it is a multiple of 24. It will not work well with video sources
which are 60 fields per second.
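A little arithmetic shows where the judder comes from and why 72 Hz removes it. The sketch below (the function name is ours) uses exact fractions to work out how long each film frame stays on screen under the two schemes.

```python
from fractions import Fraction


def frame_durations(refresh_hz, repeats):
    """On-screen time of each film frame, given how many refreshes it gets."""
    return [Fraction(count, refresh_hz) for count in repeats]


at_60 = frame_durations(60, [3, 2, 3, 2])  # 3-2 cadence on a 60 Hz display
at_72 = frame_durations(72, [3, 3, 3, 3])  # every frame shown three times

print(at_60[0] - at_60[1])  # 1/60: alternate frames last 1/60 s longer
print(set(at_72))           # {Fraction(1, 24)}: every frame lasts 1/24 s
print(sum(at_60) == sum(at_72) == Fraction(4, 24))  # True
```

Either way, four film frames still take 4/24 of a second in total. Only the 72 Hz version gives each one an equal share.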
The re-interleaving process we've just covered
is specific to 24fps film material which is MPEG-2 decoded (as interlaced
fields). It's really a matter of putting the right fields together so it's
fairly simple. Deinterlacing native NTSC interlaced video material is much
more complicated. In such video material, each field is a unique image in
time, and in order to be deinterlaced at an acceptable level, it requires
getting into motion-adaptive and motion-compensation algorithms to overcome
the inherent problems of the interlaced material. There is no best method,
and the two mentioned are expensive to implement.
(Note: NTSC does not really run at 60 Hz; it is
technically 59.94 Hz. The industry rounds it up to make it easier to read.
If you did play back video at 60 Hz instead of 59.94 Hz, you would end up
with a dropped or repeated field roughly once every 17 seconds, since
1 / (60 - 59.94) is about 16.7.)
- Brian Florian -
Background
Now that you have a basic understanding of how film is
transferred to video, a few more terms need to be explained before we move on.
We
like to use the term deinterlacing, while others may call it line doubling or
even I to P conversion. (Where I means interlaced and P means
progressive). All of these really mean the same thing. We will also talk about
vertical resolution. As Brian explained above, there are 525 horizontal scan lines
(see diagram below) that make
up an NTSC image. These horizontal lines are how we measure the vertical
resolution.

We will be using the terms video mode and film mode to
describe the type of deinterlacing algorithm used. When we say Film mode, we are referring to the algorithm that will detect the 3-2 pulldown cadence and
weave the two fields of video into one that would match the original frame of
film. When this is done, you are using all 525 original lines, which gives you
the full vertical resolution of the image. This means even scenes that contain
fast motion will be displayed in full resolution.
When we say Video mode,
we are referring to the algorithm that
uses interpolation to create a full 525 line image. The most basic form of video
mode is where one field is used. If the field that we currently have is the one
that contains the odd-numbered lines like 1, 3, 5, etc., then interpolation will
be used to create the even lines like 2, 4, 6, etc. To create line 2, we would
perform some type of average of lines 1 and 3. When only interpolation is used,
you no longer have the full vertical resolution. In scenes where fast motion
exists, you lose half of the vertical resolution. A more advanced form of
video mode is called motion-adaptive, which is what Faroudja employs. This
algorithm will not only interpolate, but will also try and weave together two
fields, which would provide more vertical resolution. On a pixel-by-pixel basis,
this algorithm will weave together areas that are static (no motion) and
interpolate when there is motion.
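Both flavors of video mode can be sketched in Python. This is only an illustration under our own assumptions: the function names, the motion test (comparing the two most recent fields of the opposite parity), and the threshold are hypothetical, and none of it is Faroudja's actual implementation.

```python
def interpolate_field(field):
    """Build a full frame from one field by averaging neighboring lines.

    `field` is a list of scan lines, each a list of pixel values. The
    field's own lines are kept and a new line is created below each one.
    The last new line has nothing below it, so it repeats the line above.
    """
    frame = []
    for i, line in enumerate(field):
        frame.append(list(line))
        below = field[i + 1] if i + 1 < len(field) else line
        frame.append([(a + b) / 2 for a, b in zip(line, below)])
    return frame


def motion_adaptive(current, opposite, previous_opposite, threshold=10):
    """Per pixel: weave where the picture is static, interpolate where it moves.

    current: the field being shown (supplies every other line)
    opposite: the most recent field of the other parity
    previous_opposite: the field of that parity before it
    A pixel counts as moving if it changed by more than `threshold`
    between the two opposite-parity fields (an assumption of this sketch).
    """
    interpolated = interpolate_field(current)
    frame = []
    for i, line in enumerate(current):
        frame.append(list(line))
        new_line = []
        for x in range(len(line)):
            moving = abs(opposite[i][x] - previous_opposite[i][x]) > threshold
            new_line.append(interpolated[2 * i + 1][x] if moving else opposite[i][x])
        frame.append(new_line)
    return frame


current = [[10, 10], [30, 30]]
print(interpolate_field(current))
# [[10, 10], [20.0, 20.0], [30, 30], [30.0, 30.0]]
print(motion_adaptive(current, [[12, 99], [28, 28]], [[12, 0], [28, 28]]))
# [[10, 10], [12, 20.0], [30, 30], [28, 28]]
```

In the second result, the static pixels take their values straight from the other field (12, 28, 28), while the one pixel that changed between fields falls back to the interpolated value (20.0).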
We will break down the FLI2200 into its various parts.
These include film mode detection/bad edit detection, chroma processing
(cross-color suppression), and DCDi.
Film Mode / Bad Edit Detection
In 1989 Faroudja
invented and patented film
mode detection (also called 3-2 pulldown detection). Faroudja was the
only company who had the ability to detect the original frames of film within
the video stream and reconstruct an accurate image. This yielded an image that
was free of motion artifacts, and it contained the full vertical resolution of
the image.
In the early 90's,
a video magazine conducted a couple
of high profile video shootouts. They obtained all of the current line doublers,
as they were called back then, and put them head-to-head. One product stood out
from all the rest, the Faroudja LD-100. This was the only product that had film
mode detection, and rightly so, as Faroudja held the patent on it. This processor
was not just a little better than the others, it was a lot better.
A couple of years ago, other companies began to introduce
video products that also had film mode detection. This narrowed the
performance gap between Faroudja and the rest. However, like any great company,
Faroudja did not sit around reminiscing about the good old days; they continued
to invest in research and development, and DCDi was born. In fact, DCDi just won
Faroudja another Emmy®.
Just having film mode detection is no longer good
enough. Why? Because the 3-2 pulldown cadence, like the world, is not perfect.
Problems occur which cause that cadence to break. You might get a 2-2 or 3-3, or
4-1 cadence to name just a few of the possibilities, and this can confuse some of
the other technology in use today. Enter bad edit detection. Errors can and do
happen at any stage between the time the images travel from the film to your TV
screen. These errors will show up as artifacts on screen. The most common
artifact, a comb, happens when the video processor combines two fields of video
that come from two different frames of film. Bad edit detection is able to see
this problem coming and avoid it altogether. Figure 1 below shows an example
of what a comb would look like on-screen. Notice the picture looks like someone
ran a comb through it, thus the name.

Figure 1
How is the problem avoided?
By switching between film
and video deinterlacing. If you can't combine two fields together because they
don't belong, the product has to interpolate. Hopefully this only happens
for a few fields of video. All deinterlacing algorithms switch between film and
video, but the strength lies in how quickly you can detect the error and switch.
Many deinterlacers switch after it is too late. The goal is to switch to video
mode before you display an artifact and switch back to film mode as quickly as
possible. Remember, video mode means it is dealing with a signal that originated
as video, with 60 fields (equivalent to 30 frames) per second. Film mode means the original was a film,
with 24 frames per second. The problem occurs because the final viewing medium
is a TV, with 60 fields per second, regardless of what the original source was.
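To give a feel for how a cadence can be tracked, here is a simplified Python sketch. It relies on one property of 3-2 material: the third field of a three-field group is a copy of the field two places earlier, so a match should turn up every fifth field. All names and the threshold are hypothetical, and this is a generic illustration, not Faroudja's patented detector. A real detector decides field by field and must also cope with noise and with still scenes, where every field matches.

```python
def field_difference(a, b):
    """Mean absolute pixel difference between two fields (lists of rows)."""
    pairs = [(pa, pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)]
    return sum(abs(pa - pb) for pa, pb in pairs) / len(pairs)


def find_repeats(fields, threshold=1.0):
    """Indices of fields that match the field two earlier (same parity)."""
    return [i for i in range(2, len(fields))
            if field_difference(fields[i], fields[i - 2]) < threshold]


def choose_mode(fields, threshold=1.0):
    """Return "film" if repeated fields arrive every 5 fields, else "video"."""
    repeats = find_repeats(fields, threshold)
    locked = len(repeats) >= 2 and all(
        later - earlier == 5 for earlier, later in zip(repeats, repeats[1:]))
    return "film" if locked else "video"


def telecine(frame_count):
    """Toy 3-2 source: each field is a tiny image tagged by frame and parity."""
    fields = []
    for k in range(frame_count):
        for _ in range(3 if k % 2 == 0 else 2):
            parity = len(fields) % 2
            fields.append([[50 * k + parity] * 4 for _ in range(2)])
    return fields


clean = telecine(8)
bad_edit = clean[:4] + clean[5:]  # an edit that drops one field mid-cadence
print(find_repeats(clean), choose_mode(clean))        # [2, 7, 12, 17] film
print(find_repeats(bad_edit), choose_mode(bad_edit))  # [2, 6, 11, 16] video
```

The dropped field shifts the second match from position 7 to position 6, the five-field rhythm breaks, and the sketch falls back to video mode rather than risk weaving mismatched fields into a comb.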
Chroma Processing
The FLI2200 does a couple of different things to the
chroma, or color channel. Faroudja's most popular advertised feature is the
cross-color suppressor. This technology dates all the way back to the LD-100.
It was developed for composite sources (composite video), where the luma (black and white, or
detail) and chroma information are combined into one channel. Once combined
there is no perfect way to separate the chroma from the luma, so some of that luma information
will leak into the color channel and introduce cross-luma or dot crawl (see
diagram below), and some
of that chroma will leak into the luma channel and cause cross-color.
Dot crawl is seen at the boundaries of contrasting colors, like blue and yellow.
It looks like a moving escalator. Cross-color looks like little rainbows that show up around fine detail like in a
chain-link fence. DVDs are stored in the component format (component video),
with the chroma and luma separated to begin with, so in theory,
cross-color should not be an issue. However, the source for some DVDs is a
composite master. If you are a fan of Japanese animation, you may love the
cross-color suppressor feature, because Anime most often comes from a composite master.

Another benefit of the chroma processor is the ability to
mask the chroma upsampling error found in a large portion of DVD players. This
artifact produces horizontal stripes in areas of color that should be smooth.
With Faroudja's FLI2200 chip inside, the chroma upsampling error is greatly
reduced.
DCDi
DCDi, which stands for Directional Correlational Deinterlacing, is a video mode
algorithm. It was designed for video based
material like fast-paced sporting events. Its
purpose is to eliminate jagged edges (jaggies) along diagonal lines caused by
interpolation. If you remember, you are not simply weaving together two fields
of video that match, you have to create new information through the art of
interpolation which is really a fancy way to say you are guessing. DCDi monitors
edge transitions and fills in the gaps. The technology was introduced a few
years back in the digital format translator, a $50,000 system that broadcasters
like CBS use to upconvert NTSC to HD. It was/is used to upconvert standard
definition material (480i, what we have on conventional TV) to enhanced
definition quality (480p). You may already be enjoying DCDi today on your
digital TV.
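The details of DCDi are proprietary, so we can't show you the real thing. What we can sketch is the general idea of interpolating along an edge instead of straight up and down, using a simple textbook approach (often called edge-based line averaging). The function name and the toy pixel values below are ours, and this is a stand-in for illustration, not Faroudja's algorithm.

```python
def interpolate_line(above, below, directional=True):
    """Create the missing scan line between two known lines.

    With directional=False, each pixel is the plain vertical average.
    With directional=True, three directions are tried per pixel (straight
    down and the two diagonals) and the pair of pixels that agree best is
    averaged, so the new pixel follows a diagonal edge instead of
    smearing across it. Ties go to the vertical direction.
    """
    width = len(above)
    line = []
    for x in range(width):
        # Each candidate is (difference between the pair, average of the pair).
        candidates = [(abs(above[x] - below[x]), (above[x] + below[x]) / 2)]
        if directional and 0 < x < width - 1:
            candidates.append((abs(above[x - 1] - below[x + 1]),
                               (above[x - 1] + below[x + 1]) / 2))
            candidates.append((abs(above[x + 1] - below[x - 1]),
                               (above[x + 1] + below[x - 1]) / 2))
        best = min(candidates, key=lambda candidate: candidate[0])
        line.append(best[1])
    return line


# A dark-to-bright diagonal edge: it sits at x=3 above and x=1 below.
above = [0, 0, 0, 100, 100, 100]
below = [0, 100, 100, 100, 100, 100]
print(interpolate_line(above, below, directional=False))
# [0.0, 50.0, 50.0, 100.0, 100.0, 100.0]
print(interpolate_line(above, below, directional=True))
# [0.0, 0.0, 100.0, 100.0, 100.0, 100.0]
```

The plain average produces two gray pixels, which is the stair-step blur you see as a jaggy. The directional version places a clean edge at x=2, halfway between the edge positions in the lines above and below.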

Figure 2
In Figure 2 above, you will see
how DCDi makes the Stars and Stripes much more dramatic, and it is really a
terrific illustration of how powerful DCDi is. On the left is the original image
on a TV. The flag is blowing in the wind, and this is a very tough image
for a TV to show. On the right are enlargements of an area in the middle of the
picture. At the top, right, is that enlarged area of the flag, with DCDi turned
off. Notice the junctions of the red and white stripes. You can see jagged
lines. With DCDi turned on (bottom, right), the jagged lines are gone, and the
junctions between the red and white stripes are smooth. This is a huge
technical accomplishment by Faroudja engineers.
Because DCDi is a video algorithm
(an algorithm is a step-by-step procedure for a computation), you might wonder how it
affects viewing a film on TV. Remember, in order to avoid artifacts, a video processor will switch
modes (from film to video and back again). If the transition between video and
film is not done properly by the studio, it is called a Bad Edit. The video processor will
then treat the film material as
video during those sections of bad edits. There are a couple of giveaways when
the processor has switched from
film mode to video mode. First is the loss of resolution. This is minimized
because the Faroudja algorithm is motion adaptive. The second is the appearance of jaggies along diagonal edges. DCDi hides a good portion of the jaggies, so you
never realize when it changes from and to film mode, which is the whole point!
DCDi makes the movie watching experience more enjoyable because the annoying
artifacts are all gone, so all you have to worry about is whether there is any
more microwave popcorn in the kitchen cabinet.
Implementing the Technology
The FLI2200 chip can be customized by the
engineers who are building the products, such as DVD players. By default, the chip has preset values
and as time passes, Sage will be adding features to the set of default values to
increase its flexibility.
The clear advantage of having the FLI2200
bolted to a DVD player is that the signal stays digital during video processing,
rather than processing it in the analog domain. It is much easier to find
matched fields when they are 100% bit-for-bit identical. If you
convert to analog and then back to digital, noise will be introduced. Since
noise is random, two matched fields may contain enough noise to make them look
slightly different. This makes the work of the deinterlacer harder and not as
reliable, but not impossible. In fact they have had to do it this way for many
years, so they know how to deal with noisy material.
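Here is a small Python sketch of why the digital path makes field matching easier. The function name, the pixel values, and the fixed "noise" pattern are all made up for illustration.

```python
def fields_match(a, b, tolerance=0):
    """True if no pixel differs by more than `tolerance` between two fields."""
    return all(abs(pa - pb) <= tolerance
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


field = [[16, 128, 235], [40, 40, 40]]
digital_copy = [row[:] for row in field]        # bit-for-bit identical
noise = [[1, -2, 0], [2, -1, 1]]                # stand-in for analog noise
noisy_copy = [[p + n for p, n in zip(row, noise_row)]
              for row, noise_row in zip(field, noise)]
other_field = [[200, 30, 90], [10, 10, 10]]     # from a different film frame

print(fields_match(field, digital_copy))               # True
print(fields_match(field, noisy_copy))                 # False
print(fields_match(field, noisy_copy, tolerance=2))    # True
print(fields_match(field, other_field, tolerance=2))   # False
```

In the digital domain an exact comparison is enough. After a trip through analog, the deinterlacer has to allow some tolerance, and picking that tolerance is a balancing act: too tight and real matches are missed, too loose and fields from different frames start to look alike.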
The FLI2200 has a couple of implementation choices. There
is optional external memory that can be used, which does of course raise the cost. All
current FLI2200 implementations as of this writing use the optional memory. If
the memory is not used, then film mode detection is disabled and all deinterlacing is video mode. You would never want this in a DVD player, but
it
would be just fine in a display device (a projection TV). A display device must deal with all
kinds of sources, a large portion of it being video. A TV manufacturer is always
trying to get the costs down because as consumers we like to spend as little as
possible. It would be perfectly acceptable to have the FLI2200 in a display
running in video mode only. You could then use a progressive DVD player for all
of your movies. Of course, this would mean other sources like laserdisc (LD) and
VHS would not look as good, but LD and VHS are going the way of the Dodo.
- Stacey Spears -
"The Fifth Element" Copyright 1997, Columbia
Pictures
"Casablanca" Copyright 1942, Warner Brothers
"Top Gun" Copyright, 1986,
"Galaxy Quest" Copyright
1999, Dreamworks
Stacey Spears
is Video Editor, and Brian Florian is Editor, Canada, for Secrets of Home
Theater and High Fidelity (http://www.hometheaterhifi.com)