Doi Camera, NY

Video Transfer Service, New York

Video Basics Made Easy


 

Contents
1. What is Video?- definition of Video-
2. Properties of Video - terminology
3. Video quality - how to compare
4. What is "Digital Video"? - Analog vs. Digital video-
5. How to reduce Video size? - Smaller Frame size, less FPS and Frame compression.
6. Codec - Intra-frame & Inter-Frame compression
7. Container Formats - Codec vs. Container Format.

 


 

 

1. What is Video? -Definition of Video-

The English term "video" comes from the Latin word "video", which means "I see" (from the verb "videre", "to see").

Wikipedia defines "video" as follows:
Video is the technology of electronically capturing, recording, processing, storing, transmitting, and reconstructing a sequence of still images representing scenes in motion.

Difference between "Movie" and "Video"


Item | Movie | Video
Purpose | Big screen by projection | Monitor
Frame | One frame at a time | Dot (or pixel) > Line > (Field) > Frame
Big image | Easy to achieve | Limited by resolution (pixels)
Needs electricity? | No | Yes
     

 

(1) Video is a sequence of still images representing scenes in motion.
The basic idea is the same as old movie film, where the "sequence of still images representing scenes in motion" is captured on (analog) film, and the film has to be chemically developed before the images can be shown.

(2) Video is a motion-picture technology that is done "electronically".
You might think that shooting a Hollywood-type movie on film uses a lot of electronics.
But at its most basic, a movie can be captured on film with no batteries, no lights and no electricity at all.
An old 8mm movie film camera had a small handle to wind the film, but no battery.

(3) In video, "capturing, recording, processing, storing, transmitting, and reconstructing" are all done electrically.
-1. Capture - A video camera (camcorder) needs a battery; a film movie camera doesn't.
-2. Record - Mostly on magnetic tape (e.g. VHS tape), a hard drive or some form of card (e.g. SD card).
-3. Process - Tapes and other media are processed with electric devices.
-4. Store - On the same media used for recording (see -2).
-5. Transmit - TV broadcast, the Internet, a LAN or a Wi-Fi network (iPad, iPhone, iPod...).
-6. Reconstruct - Movie films don't need to be reconstructed.
Why does video have to be "reconstructed"?
Because video has to be shrunk to fit on a single disk (e.g. a DVD disk) or to be transmitted through the Internet, it has to be restored to its original size when viewed.
You receive TV programs by cable, dish or phone line, and these signals (data) are reconstructed as images (video) on your TV monitor. Likewise, video can be transmitted through the Internet and reconstructed on your iPhone.
The tool that does this is called a codec (explained in detail in section 6).

Then why was there no such idea of reconstruction with movie films?
Because film cannot be physically shrunk, so the reels were simply delivered to movie theaters by car, truck or FedEx.

 

2. Properties of Video

Digital video comprises a series of orthogonal digital images displayed in rapid succession at a constant rate.
In other words, the very basic idea of video is the same as 100 years ago - it's a flip book, just a digital version.

Let's learn the terminology (words) of VIDEO, so we can talk to each other smoothly.

Frame

Each page (= image) is called a frame.

 

Pixel

The size of each frame is measured in digital units: how many pixels it has horizontally and vertically (Width x Height), e.g. 720 x 480 pixels.
Pixel is short for Picture Element, the smallest addressable screen element; it is the smallest unit of a picture that can be controlled. Graphics monitors display pictures by dividing the display screen into millions of pixels, arranged in rows and columns.

How big is each frame = Width x Height in number of pixels.

 

FPS

How many Frames are shown per second = Frames Per Second (FPS).

The minimum frame rate to achieve the illusion of a moving image is about 15 frames per second.
Movie film is shot at a frame rate of 24 frames per second, which slightly complicates the process of transferring a cinematic motion picture to video.

In video, NTSC (USA, Canada, Japan, etc.) is 29.97 frames per second (FPS), while the PAL (Europe, Asia, Australia, etc.) and SECAM (France, Russia, parts of Africa, etc.) standards are 25 FPS.

Item | Old TV system | HD TV system
Frames per second | same for both (30 or 25) | same for both (30 or 25)
Number of vertical lines | Fixed (525 or 625 lines) | 720 or 1080
Horizontal resolution | | 1280 or 1920
Display ratio | 4:3 | 16:9
Scanning system | Interlace | Interlace and Progressive
     



These frames-per-second formats apply to HDTV systems too: as with the old TV systems, NTSC HDTV is 60i (= 30 frames per second) and PAL HDTV is 50i (= 25 frames per second).
However, unlike the old (= traditional) TV system, where the vertical resolution and display ratio are fixed, HDTV broadcast systems are identified by three major parameters: 1) frame size, 2) scanning system, 3) frame rate.

-1. Frame size in pixels is defined as the number of horizontal pixels x the number of vertical pixels, for example 1280 x 720 or 1920 x 1080. Often the number of horizontal pixels is implied from context and is omitted, as in the case of 720p and 1080p.
-2. Scanning system is identified with the letter p for progressive scanning or i for interlaced scanning.
-3. Frame rate is identified as the number of video frames per second. For interlaced systems, an alternative form that specifies the number of fields per second is often used (60i = 60 fields = 30 frames per second).

If all three parameters are used, they are specified in the following forms:
a) frame size / scanning system / frame or field rate
or
b) frame size / frame or field rate / scanning system.

But for commercial product naming, the frame rate is often dropped and is implied from context (e.g., a 1080i television set).

A frame rate can also be specified without a resolution.
For example, 24p means 24 progressive scan frames per second, and 50i means 25 interlaced frames per second.

In the NTSC market, the 1080i30 or 1080i60 notation identifies an interlaced scanning format with 30 frames (60 fields) per second, each frame being 1,920 pixels wide and 1,080 pixels high.
The 720p60 notation identifies a progressive scanning format with 60 frames per second, each frame being 720 pixels high; 1,280 pixels horizontally are implied.
50Hz systems allow for only three scanning rates: 25i, 25p and 50p. 60Hz systems operate with a much wider set of frame rates: 23.976p, 24p, 29.97i/59.94i, 29.97p, 30p, 59.94p and 60p.
In standard definition television, the fractional rates were often rounded up to whole numbers, e.g. 23.976p was often called 24p, and 59.94i was often called 60i.

High definition television allows using both fractional and whole rates; therefore strict usage of notation is required.
Nevertheless, 29.97i/59.94i is almost universally called 60i, likewise 23.976p is called 24p.

 

Duration

How long is the movie = duration or runtime, normally in seconds.

 

Color Depth

How rich the color is = how many colors each frame can contain.
With no color at all, each pixel is 1 bit (= on or off). Beyond that there are 256 colors (= 8 bit color depth), 65,536 colors (= 16 bit),
"True color" mode (= 24 bit) and even 32 bit. Windows 7 supports up to 48 bit color depth.

 

Bit Rate
BPS

Bit rate, or bits per second (BPS) = the number of bits that are conveyed or processed per second.
( BIT is short for binary digit, the smallest unit of information on a computer. The term was first used in 1946 by John Tukey, a leading statistician and adviser to five presidents. A single bit can hold only one of two values: 0 or 1. A byte is composed of 8 consecutive bits.)

BPS is calculated with the following formula:
BPS= Width x Height x color depth x FPS
(how much info per second = how big is each frame size x how many colors x how many frames per second)

Bit rate plays an important role when transmitting video because the transmission link (such as the Internet or TV cable) must be capable of supporting that bit rate. Bit rate is also important when storing video (such as on a DVD disk or an iPhone) because, as explained below, the video size is proportional to the bit rate and the duration. The bit rate of uncompressed video is too high for most practical applications. Video compression is used to greatly reduce the bit rate.

 

Video Size

Video size = how big the digital video file is.

It can be calculated with the following formula:
Video size = Width x Height x color depth x FPS (frames per second) x runtime in seconds.
(How big each page is x how rich its color is x how many pages per second x how long)
Since BPS (bits per second) is the amount of info per second, video size can also be calculated as:
Video size = BPS x total runtime in seconds

For example, take a video with a duration (T) of 1 hour (3600 sec), a frame size of 640 x 480 (W x H), a color depth of 24 bits and a frame rate of 25 fps.
Pixels per frame = 640 * 480 = 307,200
Bits per frame = 307,200 * 24 = 7,372,800 = 7.37 Mbits
Bit rate (BR) = 7,372,800 * 25 = 184,320,000 = about 184 Mbits/sec
Video size (VS) = 184 Mbits/sec * 3600 sec = 662,400 Mbits = 82,800 Mbytes = 82.8 Gbytes
If this is a High Def video, the pixels per frame are 1920 x 1080 = 2,073,600, and the same one-hour video comes to about 560 GB.
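
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it uses exact values, so the one-hour result comes out at about 82.9 GB rather than the 82.8 GB obtained when the bit rate is first rounded to 184 Mbits/sec.

```python
def bits_per_second(width, height, color_depth_bits, fps):
    """Uncompressed bit rate: pixels per frame x bits per pixel x frames per second."""
    return width * height * color_depth_bits * fps


def video_size_bytes(width, height, color_depth_bits, fps, runtime_seconds):
    """Uncompressed file size in bytes (8 bits per byte)."""
    return bits_per_second(width, height, color_depth_bits, fps) * runtime_seconds // 8


if __name__ == "__main__":
    # Same assumptions as the worked example: 24 bit color, 25 fps, one hour.
    for label, (w, h) in {"640 x 480": (640, 480), "1920 x 1080": (1920, 1080)}.items():
        rate = bits_per_second(w, h, 24, 25)
        size = video_size_bytes(w, h, 24, 25, 3600)
        print(f"{label}: {rate / 1e6:.2f} Mbits/sec, {size / 1e9:.1f} GB per hour")
```

Changing any one input (frame size, color depth, FPS or runtime) scales the result in direct proportion, which is why those are the levers used later to make video smaller.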

 

Interlace vs. Progressive

[Figure: 1st field (first 1/60 sec.) + 2nd field (next 1/60 sec.) = the frame you can see (every 1/30 sec.)]

Movie film has a lot of frames. Each frame is a still photo. Through a big lens, each frame is projected onto a big screen.
16mm film has good enough resolution for a medium-sized room, and 35mm movie film is good for a regular movie theater screen.

Human beings have not invented a technology that can display a whole "area" or "frame" on a monitor instantly.
It sounds strange, but it's true.
Instead, we use a small dot called a pixel (= Picture Element) to draw lines, and many lines make up a field (or frame). It is the same concept as painting on a canvas.
To fill an entire frame (= the size of the TV monitor) in one action, a dot (pixel) has to start from the top left corner, run to the right end, return to the left end one line below, run to the right end again, and repeat for about 525 lines until it finishes at the bottom right corner.
That is a long trip, and it takes a lot of time to go through 525 lines in one action.

In the 1940s, when the TV system was being developed, there was no technology that could achieve that kind of speed. What could be done?

(1) Interlace

Someone thought of breaking a single frame into halves, so the beam has to run only half the distance at a time.
This idea is called the "Interlace Scanning System".

The basic idea was to divide a frame into two halves, called fields (one field = half a frame).
The first field is one half of the original frame.
The second field is the other half of the original frame.
The two fields are displayed one after the other very quickly (each in 1/60 of a second).
Because of the persistence of vision effect, the human eye sees the two fields as one full frame.
In this way, the beam (dot) can run at half the speed, which was possible even back then.

Interlacing was invented as a way to achieve good visual quality within the limitations of a narrow bandwidth.

How to divide a frame was another problem.
It was not divided in halves vertically or horizontally.
The horizontal scan lines of each interlaced frame are numbered consecutively and partitioned into two fields:
a) the odd field - consisting of the odd-numbered lines, and
b) the even field - consisting of the even-numbered lines.
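
The odd/even partition described above can be sketched in Python. This is an illustration only: a frame is modeled as a plain list of scan lines (numbered from 1, as in the text), and the function names are hypothetical.

```python
def split_into_fields(frame):
    """Split a frame (a list of scan lines) into its odd and even fields.

    Scan lines are numbered from 1, so list index 0 is line 1 (odd).
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Rebuild the full frame by alternating lines from the two fields."""
    frame = []
    for i in range(len(odd_field) + len(even_field)):
        source = odd_field if i % 2 == 0 else even_field
        frame.append(source[i // 2])
    return frame


if __name__ == "__main__":
    frame = [f"line {n}" for n in range(1, 8)]  # a tiny 7-line frame
    odd, even = split_into_fields(frame)
    print("odd field: ", odd)
    print("even field:", even)
    print("rebuilt frame matches original:", weave(odd, even) == frame)
```

Each field carries only half of the lines, which is why the beam needs to cover only half the distance per pass.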

NTSC, PAL and SECAM are interlaced formats.
Abbreviated video resolution specifications often include an i to indicate interlacing.
For example, PAL video format is often specified as 576i50, where 576 indicates the vertical line resolution, i indicates interlacing, and 50 indicates 50 fields (half-frames) per second.

Simply put, this is a trick: thanks to the persistence of vision effect, we are always looking at two half-images.
To display an interlaced stream (such as analog, DVD or satellite video) on progressive scan devices (such as TFT TV sets, projectors and plasma panels), a procedure known as deinterlacing is used to convert it. Deinterlacing cannot, however, produce video quality equivalent to true progressive scan source material.

(2) Progressive

The interlace technique was developed because TV technology could not display an entire frame in one action.
In the 21st century, we have technology that makes a beam (pixel) run so quickly that it can cover an entire monitor in one action.

This technique is called the Progressive Scanning System.
In progressive scan systems, each refresh period updates all of the scan lines.
The result is a higher spatial resolution and the absence of various artifacts that can make parts of a stationary picture appear to be moving or flashing.

In HDTV systems, 720p is better than 720i, and 1080p is better than 1080i.

 

Aspect Ratio

Display device aspect ratio and Pixel shape aspect ratio

(1) Display Aspect Ratio (DAR)

The screen aspect ratio of a traditional television screen is 4:3, whereas high definition televisions use an aspect ratio of 16:9.

(A) Traditional TV monitor is 4:3 ratio

The 4:3 ratio (generally named as "Four-Three", "Four-by-Three", "Four-to-Three", or "Academy Ratio") for standard television has been in use since television's origins.

Why 4:3?
Back in the 1940s, when TV broadcasting was being planned, the monitor shape could have been almost anything from a circle or a square to a panorama (technically a circular shape was easy to make, but it didn't appeal to the general audience).
So why was the 4:3 ratio set as the TV monitor standard?
The reason was that TV stations had to play movie films as well as live performances.
The 4:3 ratio comes from the shape of movie film. If TV stations broadcast nothing but live performances,
the broadcast image shape could have been almost anything.
But TV stations had to broadcast movies, too.

In the 1940s, the only recording medium for motion pictures was movie film (nothing else).
Since the image ratio on movie film was 4:3, the TV system had to adopt this ratio so that the full image of a movie could be shown on TV monitors.
In other words, by making the TV monitor's aspect ratio match that of movie film, movies previously photographed on film could be viewed satisfactorily on TV.

Let's go back a little further.
Why, then, does movie film have a 4:3 ratio?

It goes back to when Thomas Edison invented the motion picture.
In motion picture formats, the physical size of the film area between the sprocket perforations determines the image's size.
The universal standard (established by William Dickson and Thomas Edison in 1892) is a frame that is four perforations high.
The film itself is 35 mm wide (1.38 in), but the area between the perforations is 24.89 mm x 18.67 mm, which gives the de facto ratio of 4:3.

Later, in the TV era, when cinema attendance dropped, Hollywood created widescreen aspect ratios (such as 1.85:1) in order to differentiate the film industry from TV.

(B) HD TV monitor ratio is 16:9

Why the 16:9 ratio for HD TV?
There is no deep reason.
In the 1990s, the Society of Motion Picture and Television Engineers (SMPTE) decided that 16:9 should be the international standard aspect ratio (meaning for both PAL and NTSC TV systems).
There were many aspect ratios in use back then, so SMPTE settled on 16:9 as a compromise among them.
It was more of a political decision than a mathematical one.

 

(2) Pixel Aspect Ratio (PAR)

Pixel is short for Picture Element, and is the smallest element (unit) on a monitor.

Pixels on computer monitors are usually square, but pixels used in digital video often have non-square aspect ratios.
For example, an NTSC DV image of 720 pixels by 480 pixels (= 3:2 ratio) fills a standard TV monitor (= 4:3)
if the pixels are thin (taller than they are wide).
The same 720 x 480 image fills a 16:9 HD monitor if the pixels are fat (wider than they are tall).
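
How thin or fat the pixels must be follows from simple geometry: pixel aspect ratio = display aspect ratio / (width / height in pixels). The sketch below applies that relationship; the function name is made up for illustration, and published video standards may specify slightly different pixel aspect ratios than this plain geometric calculation gives.

```python
from fractions import Fraction


def pixel_aspect_ratio(width, height, display_aspect):
    """Pixel shape (width / height of one pixel) needed for a width x height
    image to exactly fill a display with the given aspect ratio."""
    storage_aspect = Fraction(width, height)  # 720 x 480 gives 3:2
    return display_aspect / storage_aspect


if __name__ == "__main__":
    for name, dar in {"4:3": Fraction(4, 3), "16:9": Fraction(16, 9)}.items():
        par = pixel_aspect_ratio(720, 480, dar)
        shape = "thin" if par < 1 else "fat" if par > 1 else "square"
        print(f"720 x 480 on a {name} display: pixel aspect ratio {par} ({shape})")
```

A value below 1 means thin pixels and a value above 1 means fat pixels, matching the two cases described above.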

   

3. Video Quality

How to compare video quality?

How can we tell that Hi8 tape is "better" than VHS, or that DVD is better than VHS?

(1) Digital Video

In the digital video world, the number of pixels rules. More pixels, better picture.
In the real world that is often not the whole story, but in theory the number of pixels in a given area is a very important factor in video quality.
For example, on the same size of TV monitor, a 1920 x 1080 pixel image has better quality than a 640 x 480 pixel image.

(2) Analog Video

In the old analog video world, pixels play no role in judging video quality because, unlike digital video, analog video is not made of pixels.

So what differentiates analog video quality?
If you can see more details on the monitor, that's a better image.
To show more details, the monitor needs more dots, like pixels.
In the digital HDTV world, more dots means more pixels. The number of pixels rules.
With analog TV, the number of horizontal lines (vertical resolution) is fixed (NTSC is 525, PAL is 625).
So how can we differentiate the image quality? The answer is horizontal resolution.


-1. Number of horizontal lines (= vertical resolution)

In the old TV system, the number of horizontal lines was fixed.
NTSC (= US) is 525 lines and PAL (= Europe) is 625 lines.
As the term suggests, "vertical resolution" is independent of the system bandwidth and defines the capability of the system to resolve horizontal lines; it is expressed as the number of distinct horizontal lines.

Unlike today's HD NTSC TV system, whose number of horizontal lines can be 720 or 1080, the old NTSC analog TV system has a fixed 525 lines. However, because vertical resolution is completely lost when the scanning spot straddles picture details, subjective data show that the actual vertical resolution is about 70 percent (the Kell factor) of the number of screen lines.
In NTSC, there is a total of 525 lines per frame, of which about 40 are blanked, leaving, typically, about 485 active lines per frame. Given a Kell factor of 0.7, the effective vertical resolution is about 340 lines: (525 - 40) x 0.7 = 339.5.
You might want to call the old NTSC TV system 340i, compared with 1080p (today's HD TV).
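
The Kell factor estimate above is a one-line calculation. As a small sketch (the function name is hypothetical, and the 40 blanked lines and the 0.7 factor are the approximate figures quoted in the text):

```python
def effective_vertical_resolution(total_lines, blanked_lines, kell_factor=0.7):
    """Estimate the visible vertical resolution: active lines x Kell factor."""
    active_lines = total_lines - blanked_lines
    return active_lines * kell_factor


if __name__ == "__main__":
    lines = effective_vertical_resolution(525, 40)
    print(f"NTSC effective vertical resolution: about {lines:.1f} lines")
```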

Why is NTSC 525 lines and PAL 625 lines?
Why not a simpler round number like 500 or 600?
The figure of 525 lines was chosen as a consequence of the limitations of the vacuum-tube-based technologies of the day.
In early TV systems, a master voltage-controlled oscillator was run at twice the horizontal line frequency, and this frequency was divided down by the number of lines used (in this case 525) to give the field frequency. This frequency was then compared with the 60 Hz power-line frequency, and any discrepancy was corrected by adjusting the frequency of the master oscillator.

For interlaced scanning, an odd number of lines per frame was required in order to make the vertical retrace distance identical for the odd and even fields; an extra odd line means that the same distance is covered in retracing from the final odd line to the first even line as from the final even line to the first odd line, which simplifies the retrace circuitry. The closest practical sequence to 500 was 3 x 5 x 5 x 7 = 525. Similarly, 625-line PAL-B/G and SECAM use 5 x 5 x 5 x 5 = 625.
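
The divider chains quoted above are simply the prime factorizations of the line counts. A short sketch (the helper name is made up for illustration) confirms that both 525 and 625 break down into small odd factors only:

```python
def prime_factors(n):
    """Return the prime factors of n in ascending order (trial division)."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


if __name__ == "__main__":
    for lines in (525, 625):
        print(lines, "=", " x ".join(str(f) for f in prime_factors(lines)))
```

By contrast, round numbers such as 500 or 600 contain factors of 2, so they are even and do not meet the odd-line requirement for interlacing.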


-2. Horizontal resolution = number of elements on a horizontal line

Because no beam runs vertically on a TV monitor, horizontal resolution is defined by the number of dots (= elements) on each horizontal line.

(3) Comparison of video formats

Below is a list of modern, digital-style resolutions (and traditional analog "TV lines per picture height" measurements) for various media.
The list only includes popular formats.
All values are approximate NTSC resolutions.

Analog formats
352 x 240 (250 lines at low-definition): Video CD
350 x 480 (250 lines): U-matic, Betamax, VHS, Video8
420 x 480 (300 lines): Super Betamax, Betacam (professional)
460 x 480 (330 lines): Analog Broadcast
590 x 480 (420 lines): LaserDisc, Super VHS, Hi8

Digital formats
720 x 480 (500 lines): DVD, miniDV, Digital8
720 x 480 (380 lines): Widescreen DVD
1280 x 720 (680 lines): Blu-ray
1440 x 1080 (760 lines): miniDV (high-def variant)
1920 x 1080 (1020 lines): Blu-ray

 

4. What is Digital Video?

 

 

 

5. How to reduce Video file size?

Now that we are familiar with digital video terminology, we can talk about digital video in real world.

Summary
1. Why does digital video have to be small?
2. How can we make digital video smaller?

(1) Why does digital video have to be reduced?

Just think about when and how you enjoy video in your daily life.
-1. from the Internet (YouTube, CNN news...)
-2. Hollywood style movie on a DVD video disk (DVD player, PC or Mac)
-3. On your handheld device (iPad, iPhone, iPod...)
-4. Since 2009, your home TV is receiving digital video broadcast (through Cable, Dish or Phone line..)

As we learned before, a one-hour movie at standard resolution (640 x 480 pixels) is about 82.8 GB, which means a two-hour movie comes to about 165 GB.
A two-hour movie in HD (1920 x 1080 pixels) has 6.75 times as many pixels per frame, so it comes to about 1,120 GB.

One DVD disk can hold about 4.5 GB of data, so it would take about 37 DVDs for a standard two-hour movie and about 250 DVDs for a two-hour HD movie.
You would have to change DVDs about every 3 minutes with a standard movie and about every 30 seconds with an HD movie!
Your iPad, iPhone or iPod can't hold that much data.
When you want to share your home movie on the Internet, the file is way, way too big to upload.

More importantly, the information is too big to be processed and moved around inside computers and networks.

The digital movie has to be much more compact; at the very least, we have to reduce it to a size that will fit on a single disk.

That is why digital video compression is necessary, and people have developed many ideas to make it happen.

 

(2) How to make digital video smaller?

Let's go back and think about what defines video file size.

Video size was defined by the following formula:
Video size
= Width x Height x color depth x FPS (frames per second) x runtime in seconds
= Frame size x number of frames per second x runtime.

So there are 3 ways to reduce video file size:
-1. make the frame size smaller
-2. send fewer frames per second
-3. make the movie shorter
(e.g. YouTube is limited to 10 minutes, but logically speaking this one is too obvious, so we won't discuss it)

We will talk about -2 (send fewer frames per second) first, followed by -1 (make the frame size smaller), because there is a lot more to say about how to "shrink" frame size than about reducing the FPS (Frames Per Second) number.

 

-1. How to reduce FPS (Frames per second) number? - Constant bit rate (CBR) vs. Variable bit rate (VBR) -

BPP stands for the average number of bits per pixel. There are compression algorithms that keep the BPP almost constant throughout the entire duration of the video.
In this case the video is output with a constant bit rate (CBR). CBR video is suitable for real-time, non-buffered, fixed-bandwidth video streaming (e.g. in videoconferencing).

Not all frames can be compressed to the same level, because quality is more severely affected in scenes of high complexity, so some algorithms constantly adjust the BPP. For example, an interview movie where the background is almost the same for the entire footage is a good candidate for CBR. In a baseball game movie, however, we can't reduce the number of frames much while a player is running, or the motion becomes choppy, but the reduction is hardly noticeable when the game is moving slowly. In other words, these algorithms keep the BPP high while compressing complex scenes and low for less demanding scenes. This way one gets the best quality at the smallest average bit rate (and, accordingly, the smallest file size). Of course, with this method the bit rate is variable because it tracks the variations of the BPP. Variable bit rate (VBR) encoding is commonly used on MPEG-2 video, MPEG-4 Part 2 video (Xvid, DivX, etc.), MPEG-4 Part 10/H.264 video, Theora, Dirac and other video compression formats.
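
The difference between CBR and VBR can be illustrated with a toy bit budget. This is only a sketch under loud assumptions: the scenes are equally long, the "complexity" scores are invented numbers, and real encoders use far more elaborate rate control than a simple proportional split.

```python
def cbr_allocation(complexities, average_kbps):
    """Constant bit rate: every scene gets the same rate, complex or not."""
    return [float(average_kbps) for _ in complexities]


def vbr_allocation(complexities, average_kbps):
    """Variable bit rate: split the same total budget in proportion to each
    scene's complexity score (a hypothetical measure for illustration)."""
    total_budget = average_kbps * len(complexities)
    total_complexity = sum(complexities)
    return [total_budget * c / total_complexity for c in complexities]


if __name__ == "__main__":
    # Hypothetical scenes: two calm interview shots, a player running, a slow pan.
    complexities = [1, 1, 4, 2]
    print("CBR kbps per scene:", cbr_allocation(complexities, 1000))
    print("VBR kbps per scene:", vbr_allocation(complexities, 1000))
```

Both allocations spend the same total, so the file size is the same; VBR simply moves bits from the calm scenes to the demanding one.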


How to compress video frames?

There are 2 basic methods to reduce video frame size.
-1. Intra-frame compression (shrink each frame, one by one)
-2. Inter-frame compression (shrink a group of frames together)

Intra-frame compression: the shrink-each-frame-one-by-one method
Shrink every frame, say to 1/20 of its size, then expand each one back to its original size when viewing it. The file is smaller only while it is stored (e.g. on a DVD disk or on the storage in an iPhone) or while it is uploaded or downloaded over the Internet, and it is restored when viewed. This is called Intra-frame compression.

Inter-frame compression: the shrink-a-group-of-frames-together method
Shrinking a group of frames together compresses more efficiently than shrinking each frame one by one. This is called Inter-frame compression.

Whether frames are handled one by one or as a group, the basic idea of compressing a digital video file is the same: we shrink (compress) the original video to 1/20 of its size or even smaller for storage on media (like a DVD or an iPhone) or for transfer through a network, and it is restored to its original size when viewed.

The program that reduces the digital video file size is called the Encoder, and the program that restores it to its original size is called the Decoder.
The Encoder and Decoder are a pair of programs that work together - the Encoder is the entrance and the Decoder is the exit.
This Encoder / Decoder pair is called a Codec.
Codec = Compressor + Decompressor = Encoder + Decoder

When we talk about a "Codec", we are talking about both compressing and decompressing a digital video file.
We also know that there are two basic kinds of Codec: one is based on Intra-frame compression (each frame on its own) and the other on Inter-frame compression (a group of frames together).
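
The Encoder / Decoder pair can be sketched as a toy intra-frame codec in Python. This is an assumption-laden illustration, not a real video codec: frames are plain byte strings, and Python's general-purpose, lossless zlib compressor stands in for the lossy image compression that real intra-frame codecs use.

```python
import zlib


def encode(frames):
    """Encoder: compress every frame independently (intra-frame style)."""
    return [zlib.compress(frame) for frame in frames]


def decode_frame(encoded_frames, index):
    """Decoder: restore one frame without touching any other frame."""
    return zlib.decompress(encoded_frames[index])


if __name__ == "__main__":
    # Five fake "frames" of 10,000 bytes each, every frame a flat color.
    frames = [bytes([shade]) * 10_000 for shade in range(5)]
    encoded = encode(frames)
    original_size = sum(len(f) for f in frames)
    stored_size = sum(len(e) for e in encoded)
    print(f"original: {original_size} bytes, stored: {stored_size} bytes")
    print("frame 3 restored correctly:", decode_frame(encoded, 3) == frames[3])
```

Because every frame is compressed on its own, any single frame can be restored directly, which is the property that makes this style of codec convenient for editing.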

 

6. Codec

As explained before, there are two different approaches to compressing digital video.
One is called Intra-frame compression and the other is called Inter-frame compression.
Intra-frame compression shrinks each frame individually, one by one, while Inter-frame compression picks a keyframe, compares the frames around that keyframe to it, analyzes how similar they are, and compresses them as a group of frames.
The compression ratio of the Inter-frame method is much higher than that of Intra-frame compression. In other words, you can make a video clip a lot smaller by using the Inter-frame compression technique.

We will look at the pros and cons of both methods.

(1) Intra-frame Codec (Editable Codecs)

There are two well-known Intra-frame compression codecs.
One is M-JPEG and the other is DV.
They both compress each frame individually.
The difference between them is that M-JPEG has a flexible compression ratio setting, while DV's is fixed.

Let's see more details.

-1. M-JPEG (MJPEG, Motion JPEG)

Summary
i) One of the two popular Intra-frame compression codecs.
ii) Good for video editing, because individual frames are accessible and it is supported by a lot of hardware and software.
iii) The compression ratio is not high - files are much more than double the size of those from Inter-frame codecs.
iv) An old technique, but one that will stay around in the future.

Sample file:
Mjpeg sample video (22 mb)


Motion JPEG uses a lossy form of Intra-frame compression. Basically, using a technique similar to the one JPEG uses for still photos, M-JPEG compresses each frame individually, one by one.
Nearly all software implementations of M-JPEG permit user control over the compression ratio (as well as other optional parameters), allowing the user to trade off picture quality for smaller video file size.
In embedded applications (such as miniDV, DVCam or DVCPro), the parameters are pre-selected and fixed for the application.

M-JPEG is an Intra-frame compression scheme (as opposed to the more computationally intensive Inter-frame compression technique). Whereas modern Inter-frame video formats, such as MPEG-1, MPEG-2 and MPEG-4/H.264, achieve real-world compression ratios of 1:50 or better (smaller files), M-JPEG achieves 1:20 or lower (bigger files). Because frames are compressed independently of one another, M-JPEG imposes lower processing and memory requirements on hardware devices.

As a purely Intra-frame compression scheme, the image quality of M-JPEG is directly a function of each video frame's static (spatial) complexity. Frames with large smooth transitions or monotone surfaces compress well, and are more likely to hold their original detail with few visible compression artifacts. Frames exhibiting complex textures, fine curves and lines (such as writing on a newspaper) are prone to exhibit DCT artifacts such as ringing, smudging, and macroblocking. M-JPEG compressed video is also insensitive to motion complexity, i.e. variation over time. It is neither hindered by highly random motion (such as the surface-water turbulence in a large waterfall), nor helped by the absence of motion (such as a static landscape shot from a tripod), which are two opposite extremes commonly used to test Inter-frame video formats.

For QuickTime formats, Apple has defined two types of coding: MJPEG-A and MJPEG-B. MJPEG-B no longer retains valid JPEG Interchange Files within it, hence it is not possible to take a frame into a JPEG file without slightly modifying the headers.

M-JPEG is often used in non-linear video editing systems because it natively offers random access to any individual frame (no Inter-frame codec allows this), it is a mature format, it needs no special hardware on modern PCs, and it is widely supported by almost all editing equipment and software.

Although the bit rate of M-JPEG is lower than that of uncompressed video, it is much higher than that of video formats which use Inter-frame motion compensation, such as MPEG-4. The large library of legacy software, the low computational requirement and the ease of editing ensure that M-JPEG content will be playable well into the future, even if the applications/equipment which created the content no longer exist.

For detailed specifications of Motion JPEG, here is the link to Apple's QuickTime format in Motion JPEG (PDF file)

-2. DV

Summary
i) Good for non-linear video editing, because an Intra-frame compression codec allows access to individual frames.
ii) File size is huge.
iii) Comes in different formats: DV, DVCam, DVCPro.

There is a comprehensive explanation about DV format.

 

(2) Inter-frame Codec

-1. Basic idea of Mpeg codec.


The goal of Inter-frame compression is to minimize the amount of data used by a digital movie while keeping the differences between the original (uncompressed) image and the reconstructed image to a minimum, or at least invisible to the human eye.
Intra-frame compression, explained before, shrinks each frame one by one, without considering what the neighboring frames look like.
Inter-frame codecs take advantage of the fact that a lot more can be done to shrink a digital movie by grouping multiple frames together.
First, set a home frame (= keyframe). Analyze how different the following frames are from the keyframe.
Then, using the keyframe's info, apply new image data (pixels) only to the portions of the following frames that differ.

a sample movie

Series of 4 frames

First, we collect detailed info about a main frame (keyframe).

Then, we compare the other frames to the keyframe.
Section D is almost identical throughout the 4 frames (the sky far up) - we can reuse the same data.
Section A is almost the same as well - we can reuse most of the info throughout the 4 frames with little alteration.
Section B is very similar from frame to frame - a little change will do.
Section C is tricky - the left side is almost the same, but the right side is very different from frame to frame.

 



Next, calculate how different the next frame is, and the next, and so on through the entire group of frames.
This way, we can save a lot of digital info = we can reduce the bit rate.
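
The keyframe-plus-differences idea can be sketched in a few lines of Python. This is a deliberately simplified illustration: frames are short lists of pixel values, each following frame is stored only as the positions where it differs from the keyframe, and the function names are made up. Real MPEG codecs add motion estimation and lossy transforms on top of this idea.

```python
def encode_group(frames):
    """Keep the first frame whole (the keyframe); for every following frame,
    store only the pixels that differ from the keyframe."""
    keyframe = list(frames[0])
    deltas = []
    for frame in frames[1:]:
        changed = {i: value
                   for i, (key_value, value) in enumerate(zip(keyframe, frame))
                   if key_value != value}
        deltas.append(changed)
    return keyframe, deltas


def decode_group(keyframe, deltas):
    """Rebuild every frame by applying its changed pixels to the keyframe."""
    frames = [list(keyframe)]
    for changed in deltas:
        frame = list(keyframe)
        for i, value in changed.items():
            frame[i] = value
        frames.append(frame)
    return frames


if __name__ == "__main__":
    # Three tiny "frames": mostly unchanged, only the last pixels move.
    frames = [[0, 0, 0, 0, 5, 5],
              [0, 0, 0, 0, 5, 6],
              [0, 0, 0, 0, 6, 6]]
    keyframe, deltas = encode_group(frames)
    print("keyframe:", keyframe)
    print("changes per following frame:", deltas)
    print("round trip ok:", decode_group(keyframe, deltas) == frames)
```

The more alike the frames in a group are, the fewer changed pixels have to be stored, which is where the savings come from.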

Quality and file size - We are getting better and better at compressing digital video images:
-1. better image quality (much truer to the uncompressed original)
-2. much more compact (file size is much smaller than the original video; HD movies can be delivered through the Internet)
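To get a feeling for how much smaller, here is a rough back-of-the-envelope calculation. The numbers are assumptions chosen for illustration: a 720x480 frame, 24 bits per pixel, a rounded 30 frames per second, and a compressed stream of 5 Mbit/s. The function name is hypothetical.

```python
def uncompressed_bitrate(width, height, fps, bits_per_pixel=24):
    """Bits per second needed for raw video with no compression at all."""
    return width * height * bits_per_pixel * fps


raw = uncompressed_bitrate(720, 480, 30)  # assumed frame size and frame rate
compressed = 5_000_000                    # assumed 5 Mbit/s compressed stream

print(f"uncompressed: {raw / 1e6:.1f} Mbit/s")
print(f"compression ratio: about {raw / compressed:.0f}:1")
```

With these assumed figures the raw stream needs roughly 249 Mbit/s, so the compressed stream carries about one fiftieth of the data.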


-2. History of Mpeg codecs

Year  Codec name      Resolution (NTSC / PAL)   Used for
1993  Mpeg-1          352x240 / 352x288         VHS tape to CD
1995  Mpeg-2          720x480 / 720x576         DVD disk
1999  Mpeg-4 part2    up to 1920 x 1080         Xvid, DivX
2003  Mpeg-4 part10   up to 1920 x 1080         Quick Time, Xvid, DivX, AVCHD, Blu-ray
      (also known as Mpeg4-AVC or H.264)

 

7. Container Formats

(1) Definition of Container Format.


A container format (also called a wrapper format) is a file format whose specification describes how data (video and audio) and metadata are stored.

 

(2) Why are Container formats necessary?

Although video and audio streams can be coded and decoded (Codec) with many different algorithms, they have to work together in harmony. The video image and the audio have to be synchronized precisely.
So we need someone like an orchestra conductor, who can synchronize the moving image with the sound perfectly.
Along with the synchronization information needed to play back the various streams together, a container (like the one on a Hollywood movie DVD) has to handle subtitles, different languages, chapters and so on.

If you are given an encoded video and an encoded sound track, how can you synchronize them? And if you want to give someone a set of video and audio files, you need a standard format under which the video and audio can work together properly.

A container format is necessary for organizing different video and audio files so that they work together.
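One way to picture this synchronization job: the container stores small chunks ("packets") of each stream together with a timestamp, and the player reads them back in time order. The sketch below is a simplified model and not the layout of any real container format; the `Packet` type, its fields and the timing values are assumptions made up for the example.

```python
import heapq
from typing import NamedTuple


class Packet(NamedTuple):
    timestamp_ms: int  # when this chunk should be presented
    stream: str        # "video" or "audio"
    payload: str       # stand-in for the encoded data


def interleave(*streams):
    """Merge streams (each already sorted by time) into one list ordered by timestamp."""
    return list(heapq.merge(*streams, key=lambda packet: packet.timestamp_ms))


# Assumed timing: one video frame every 40 ms, one audio chunk every 20 ms.
video = [Packet(t, "video", f"frame@{t}") for t in (0, 40, 80)]
audio = [Packet(t, "audio", f"samples@{t}") for t in (0, 20, 40, 60, 80)]

for packet in interleave(video, audio):
    print(packet.timestamp_ms, packet.stream, packet.payload)
```

Because every packet carries its own timestamp, the player can keep picture and sound aligned no matter which Codec produced the payloads.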



Simpler container formats can contain different types of audio Codecs, while more advanced container formats can support multiple audio and video streams, subtitles, chapter-information, and metadata (tags).

Don't confuse a container format with a Codec.
The Codec is about shrinking and restoring video and audio files, while the container format organizes different (already encoded) files so that they work together.

Let's put it another way: in video terminology, a container format is a type of file format that contains various types of data compressed by Codecs. A container format doesn't specify which Codec is used; rather, it defines how the video, audio and other data are stored within the container so they can work together.

To drive a car that is locked in a container, you need a key for the container and another key for the car.
To play a video in a container format, you need a key for the container (software that can read the container file) and a key for the video (Codec).
To play a DivX-encoded video in the AVI container format, first you need support for the AVI container, which "Windows Media Player" has, and second, you need a key for the video, which is the DivX Decoder (Codec). In other words, you need two things on your PC/Mac to play a video: 1) container support and 2) the Codec.
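The "two keys" idea shows up in the files themselves: the first few bytes of a file usually identify the container, but say nothing about which Codec was used inside. The sketch below checks a few well-known signatures. It is a simplified illustration, not a complete detector (for example, older QuickTime files may not start with an `ftyp` box), and the function name `sniff_container` is made up for this example.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    # AVI is a RIFF file whose form type is "AVI ".
    if header[:4] == b"RIFF" and header[8:12] == b"AVI ":
        return "AVI"
    # MP4 and newer QuickTime files start with an "ftyp" box.
    if header[4:8] == b"ftyp":
        return "QuickTime (MOV)" if header[8:12] == b"qt  " else "MP4 family"
    # MPEG program streams begin with a pack start code.
    if header[:4] == b"\x00\x00\x01\xba":
        return "MPEG program stream"
    # ASF (WMV/WMA) files begin with the ASF header object GUID.
    if header[:4] == bytes.fromhex("3026b275"):
        return "ASF (WMV/WMA)"
    return "unknown"


# Hand-made headers for illustration; a real program would read the first
# bytes of the file, e.g. open(path, "rb").read(12).
print(sniff_container(b"RIFF\x00\x00\x00\x00AVI LIST"))
print(sniff_container(b"\x00\x00\x00\x18ftypmp42"))
print(sniff_container(b"\x00\x00\x00\x14ftypqt  "))
```

Knowing the container is only the first key. The player still has to look inside to find out which Codec each stream needs.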

 

(3) Examples

AVI (Audio Video Interleave) (.AVI)- Windows
As the name suggests, AVI files can contain both audio and video data in a file container that allows synchronous audio-with-video playback. AVI was introduced by Microsoft in November 1992 as part of its Video for Windows technology.

Since the AVI container was developed by Microsoft, you can open an AVI container as long as you have Windows Media Player on your Windows system.
(Whether you can play the contents inside the container is an issue of the Codec, not of the AVI container itself.)
It can be used for uncompressed video and audio, or for streams compressed with a VfW (Video for Windows) Codec. Common Codecs for AVI files include DivX, Xvid and DV.

Sample files:
AVI container / DivX codec (2 MB)
AVI container / H.264 codec (1.2 MB)

 

Quick Time (.MOV) - Apple
The QuickTime container is used by Apple's QuickTime encoding and playback software. It uses the extension .MOV. It is also very closely related to the MP4 container, which was originally based on it. Video streams found in QuickTime containers are generally MPEG-4 AVC, with audio typically being AAC.

Apple released the first version of QuickTime on December 2, 1991; it was an astounding technological breakthrough at the time. Microsoft's competing technology, Video for Windows, employed several thousand lines of allegedly stolen QuickTime source code and did not appear until November 1992.
"In August 1997, Apple and Microsoft announced a settlement deal. Apple would drop all current lawsuits, including all lingering issues from the Apple Computer, Inc. v. Microsoft Corp. lawsuit and the "QuickTime source code" lawsuit, and agree to make Internet Explorer the default browser on the Macintosh unless the user explicitly chose the bundled Netscape browser. In return, Microsoft agreed to continue developing Office, Internet Explorer, and various developer tools and software for the Mac for the next five years, and purchase $150 million of non-voting Apple stock." ( more details on Wikia)

Sample files:
MOV container / H.264 codec (in HD format 1920 x 1080) (22 MB)
MOV container / DivX codec (1.6 MB)
MOV container / mpeg4 codec (720 x 576) (1 MB)

 

MP4 (.MP4, .M4V, .M4A or .M4P)
MP4 is the official container for MPEG-4 video, as defined by MPEG-4 Part 14. It's based on the container Apple developed for QuickTime. The extension is .MP4; however, some files containing only video use .M4V, and audio-only files use .M4A.
.M4P is a DRM-protected file from the iTunes Store.
The streams normally found in the MP4 container are MPEG-4 ASP or MPEG-4 AVC video and AAC audio.

Sample files:
MP4 container / Mpeg-4 codec (640 x 480) (1.6 MB)
MP4 container / H.264 codec (extension M4V) (0.5 MB)


Mpeg-2 Program Stream (.MPG, .MPEG)
The MPEG-2 Program Stream is a container format for Mpeg-2 video and interleaved audio streams.
The extension is either .MPG or .MPEG. These files typically contain MPEG-2 video and either MPEG-1 Layer 2, MPEG-2 Layer 2, or AC-3 (Dolby Digital) audio.

MPEG-2 Transport Stream (.TS)
The MPEG-2 Transport Stream container is designed to deliver multiple sets of streams together that would normally have to be delivered as separate files. Rather than limiting the container to a single video stream, as most containers including the standard MPEG-2 Program Stream are, this container allows multiple video streams to be delivered simultaneously. This makes it suitable for various broadcast-type applications, such as streaming video across the Internet and satellite television. A variant of the Transport Stream (.M2TS) is also used on Blu-ray Disc video.
These files typically contain MPEG-2 video and either MPEG-1 Layer 2, MPEG-2 Layer 2, or AC-3 (Dolby Digital) audio.
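How can several streams share one file? A Transport Stream is cut into fixed-size packets of 188 bytes, each starting with the sync byte 0x47 and carrying a 13-bit packet ID (PID) that says which stream the packet belongs to. The sketch below reads only that PID and ignores everything else in the packet. The helper `make_packet` builds fake packets purely for the demonstration; both function names are made up for this example.

```python
TS_PACKET_SIZE = 188  # every transport stream packet has this length
SYNC_BYTE = 0x47      # first byte of every packet


def packet_pids(data: bytes):
    """Return the PID of every complete 188-byte packet in data."""
    pids = []
    for offset in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = data[offset:offset + TS_PACKET_SIZE]
        if packet[0] != SYNC_BYTE:
            raise ValueError(f"lost sync at byte {offset}")
        # The PID is the low 5 bits of byte 1 followed by all of byte 2.
        pids.append(((packet[1] & 0x1F) << 8) | packet[2])
    return pids


def make_packet(pid: int) -> bytes:
    """Build a dummy packet (4-byte header, empty payload) for the demo."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


# Two hypothetical streams, PID 256 and PID 257, interleaved in one file.
stream = make_packet(256) + make_packet(257) + make_packet(256)
print(packet_pids(stream))
```

A player picks out the packets with the PIDs it wants and hands each group to the matching Codec.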

VOB (.VOB)
DVD-Video uses a variation on the standard MPEG-2 container. This is generally referred to as the VOB container, because the file extension used on DVDs is .VOB for Video Object. Although there is additional information added that's not part of the original MPEG container, most software designed for reading MPEG files should be able to read streams from the VOB container.

WMV and ASF
ASF (Advanced Systems Format, Advanced Streaming Format, or Active Streaming Format) is Microsoft's container used for Windows Media Video and Audio streams. ASF files may have an extension of .ASF, or may use .WMA or .WMV to denote audio or video.

Sample files:
WMV container / Windows Media Video 9 codec (640 x 480 standard mode) (1.2 MB)
WMV container / Windows Media Video 9 codec (in 720P = 1280 x 720 HD mode) (6.5 MB)