A brief overview of the Closed Captioning System implemented in the United States


The current method for encoding caption information was originally developed by the National Bureau of Standards to transmit time-of-day information; it was later expanded to include captioning. The Public Broadcasting Service (PBS) currently handles the task of transmitting time-of-day information, and devices such as VCRs may be programmed to listen for this information and automatically adjust their clocks. Preliminary captioning service began in March of 1980 with basic text capability that allowed a transcript to scroll continuously at the bottom of the screen. The Electronic Industries Association (EIA) later defined a set of expanded capabilities under its standard EIA-608. EIA-608 is a recommendation, and implementations of closed captioning are not forced to adhere to it; however, it has become the standard in the United States. In 1993 the name CaptionVision was adopted for this system.

During the advent of captioning, the most common method of display was the external set-top decoder box rather than the integrated chipset-based decoder (built into the television), which would eventually become the dominant implementation. The set-top decoders never sold very well, reaching a saturation of perhaps 350,000 devices. To help foster the pervasiveness of closed captioning on both the transmitting and receiving ends, Congress passed the Television Decoder Circuitry Act in 1990.

Technical Information

Although captioning was not foreseen or planned for when the original NTSC video standard was developed, it has been possible to integrate the transmission of caption and other digital information into the signal while maintaining backwards compatibility. It is the task of the receiving device (television, monitor, or set-top box) to decode the text data and render it in a readable format.

The current standard provides roughly 120 characters per second encoded in an NTSC video signal. To achieve this, two bytes of data are encoded into line 21 of the vertical blanking interval of each field. The vertical blanking interval is (essentially) the brief period of time between successive video fields. There are sixty fields per second, which yields thirty frames per second (since two fields are used to display one frame), so two bytes in each of sixty fields gives the figure of 120. The alternating fields carry the even and odd scan lines respectively. Line 21 of each video field carries separate caption information, so there are captions 1 & 2 and text 1 & 2 in the first field, and captions 3 & 4 and text 3 & 4 in the second field. Although most characters may be transmitted with a single byte, some text as well as control codes require two bytes to be transmitted; this lowers the effective throughput of the system somewhat.
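The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration using the nominal numbers quoted above (sixty fields per second, two bytes per field); the constant and function names are made up for the example and do not come from EIA-608.

```python
# Nominal figures quoted in the text; names are illustrative only.
BYTES_PER_FIELD = 2       # two bytes carried on line 21 of each field
FIELDS_PER_SECOND = 60    # nominal NTSC field rate
FIELDS_PER_FRAME = 2      # two interlaced fields make one frame


def bytes_per_second(fields_per_second=FIELDS_PER_SECOND,
                     bytes_per_field=BYTES_PER_FIELD):
    """Raw line 21 capacity, counting both fields together."""
    return fields_per_second * bytes_per_field


total_rate = bytes_per_second()                              # both fields
single_field_rate = total_rate // FIELDS_PER_FRAME           # one field only
frames_per_second = FIELDS_PER_SECOND // FIELDS_PER_FRAME

print(total_rate, single_field_rate, frames_per_second)
```

Note that the total is shared between the two fields, so the captions carried in the first field alone have only half of it available.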

To help ensure that caption information is properly received, it was decided that the data should be encoded at a relatively low speed. The data is preceded by a seven-cycle sine wave and three start bits (0,0,1). Two bytes of data are encoded on the scan line (line 21), each composed of seven data bits and one parity bit (odd parity). The rise time is controlled and the amplitude is 50 IRE units. Tests conducted by PBS indicated that people have a typical reading rate of 125 words per minute for captioning. This is well below the approximate maximum of 500 words per minute, which leaves a good deal of excess bandwidth in the second video field for other types of data. Control codes (non-character information) require two bytes to be transmitted and are typically transmitted in two successive frames (called byte pair doubling) to ensure that they are received. In the future, expect to see enhanced captioning capability emerge under the new standard EIA-708.
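The odd-parity scheme and byte pair doubling described above can be illustrated with a small Python sketch. It rests on two assumptions that the text does not state: that the parity bit is the most significant bit of each byte, and that a byte pair is a control code when its first byte (with parity removed) falls in the range 0x10 to 0x1F. All function names are hypothetical.

```python
def add_odd_parity(value):
    """Return the 8-bit byte for a 7-bit value with odd parity.

    Assumption: the parity bit is the most significant bit (0x80).
    """
    if not 0 <= value <= 0x7F:
        raise ValueError("expected a 7-bit value")
    ones = bin(value).count("1")
    # Set the parity bit only when the data bits hold an even number of 1s.
    return value | 0x80 if ones % 2 == 0 else value


def has_odd_parity(byte):
    """True when the byte contains an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def strip_parity(byte):
    """Recover the seven data bits."""
    return byte & 0x7F


def is_control_pair(pair):
    """Assumption: control codes start with a byte in 0x10..0x1F."""
    return 0x10 <= pair[0] <= 0x1F


def drop_doubled_pairs(pairs):
    """Collapse a control pair that is immediately repeated.

    Ordinary character pairs are never collapsed, since repeated
    letters are legitimate text.
    """
    result = []
    previous = None
    for pair in pairs:
        if is_control_pair(pair) and pair == previous:
            previous = None  # swallow the repeat; a third copy counts as new
            continue
        result.append(pair)
        previous = pair
    return result


encoded = add_odd_parity(0x41)   # 'A' has two 1 bits, so the parity bit is set
print(hex(encoded), has_odd_parity(encoded), hex(strip_parity(encoded)))
```

A decoder built along these lines would check parity on each received byte, strip it, and only then look for doubled control pairs, so that a control code damaged in one frame can still be picked up from its repeat in the next.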

Extended Data Services (XDS or EDS) is a feature which is not entirely implemented yet but will allow the second video field to carry information related to the content of the current broadcast. This information will be encapsulated in a packet and provide details such as the time of day, station and network identification, and the name of the program. Another subset of data which resides in the second video field is defined by standard EIA-744-A, the subject of some contention and commonly known as the V-chip. This data provides rating information about the current broadcast, which allows televisions equipped with a V-chip compliant decoder to filter programming based on its rating.

It is possible to obtain a copy of EIA-608 for $106 by writing to the following address.

Electronic Industries Association
Engineering Department
2001 Pennsylvania Avenue, N.W.
Washington, D.C. 20006
202 457-4900

Display Capabilities

Captions may be presented in a few different ways:
  • Pop-On : the text must be received in its entirety before display, so there may be as much as a one second delay before it is revealed; however, it is found to be less distracting than other methods
  • Paint-On : the text appears from left to right
  • Roll-Up : the text rolls up from the bottom of the screen
It is possible to change the color of the text (but not the color of the background) to any of the following: white, red, blue, green, yellow, cyan, and magenta. It is also possible to render characters in italics or with an underline.
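As a rough illustration, the presentation modes and text attributes listed above could be modeled in a program as follows. This is a hypothetical data structure, not anything defined by EIA-608: the class and field names are invented for the example, the default of white is an assumption, and the sketch does not address which combinations of attributes a real decoder permits.

```python
from dataclasses import dataclass
from enum import Enum


class CaptionMode(Enum):
    """The three presentation modes listed in the text."""
    POP_ON = "pop-on"
    PAINT_ON = "paint-on"
    ROLL_UP = "roll-up"


# The seven text colors listed in the text; the background is not selectable.
TEXT_COLORS = ("white", "red", "blue", "green", "yellow", "cyan", "magenta")


@dataclass(frozen=True)
class TextStyle:
    """Hypothetical container for the attributes of a run of caption text."""
    color: str = "white"      # assumed default, not stated in the text
    italics: bool = False
    underline: bool = False

    def __post_init__(self):
        # Reject anything outside the seven listed colors.
        if self.color not in TEXT_COLORS:
            raise ValueError(f"unsupported caption color: {self.color!r}")


style = TextStyle(color="yellow", underline=True)
print(CaptionMode.ROLL_UP.value, style)
```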

