WO2002063562A2 - 3-d recursive vector estimation for video enhancement - Google Patents

3-d recursive vector estimation for video enhancement

Info

Publication number
WO2002063562A2
Authority
WO
WIPO (PCT)
Prior art keywords
pixel region
enhancement
candidate
enhanced
spatio
Prior art date
Application number
PCT/IB2002/000275
Other languages
French (fr)
Other versions
WO2002063562A3 (en)
Inventor
Erwin B. Bellers
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/840,817 external-priority patent/US7042945B2/en
Application filed by Koninklijke Philips Electronics N.V. filed Critical Koninklijke Philips Electronics N.V.
Priority to EP02716230A priority Critical patent/EP1360648A2/en
Priority to KR1020027013488A priority patent/KR20020087128A/en
Priority to JP2002563430A priority patent/JP2004519145A/en
Publication of WO2002063562A2 publication Critical patent/WO2002063562A2/en
Publication of WO2002063562A3 publication Critical patent/WO2002063562A3/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0135 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
    • H04N7/014 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes involving the use of motion vectors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows

Definitions

  • the present invention is directed, in general, to video enhancement systems and, more specifically, to maintaining spatio-temporal consistency during video enhancement.
  • Frequency peaking involves linear boosting or "peaking" of selected spatial frequencies within the image, often with a bandpass or highpass filter to enhance the associated spatial frequencies and with adaptive control to avoid "unnaturalness" relating to, for example, peaking large and steep edges.
  • luminance transient improvement preserves the magnitude of the edge but increases the steepness of the edge, "pulling" samples near the edge on both sides towards the edge.
  • Existing edge enhancement algorithms enhance the sharpness of an image based on the spatial information of the original image, often utilizing control parameters determined by a small spatial neighborhood of a given pixel position.
  • it is a primary object of the present invention to provide, for use in a video signal processor, a technique for enhancing video information which evaluates candidate vectors of enhancement algorithms utilizing an error function biased towards spatio-temporal consistency with a penalty function.
  • the penalty function increases with the distance— both spatial and temporal— of the subject block from the block for which the candidate vector was optimal. Enhancements are therefore gradual across both space and time and the enhanced video information is intrinsically free of perceptible artifacts.
  • Fig. 1 depicts a system in which video enhancement with spatio-temporal consistency is implemented according to one embodiment of the present invention
  • Fig. 2 illustrates in greater detail a system for video enhancement with spatio-temporal consistency according to one embodiment of the present invention
  • Fig. 3 illustrates a logical organization of video information for video enhancement with spatio-temporal consistency according to one embodiment of the present invention
  • Fig. 4 is a high level flow chart for a process of video enhancement with spatio-temporal consistency according to one embodiment of the present invention
  • Fig. 5 is an illustration of displacement of a moving object from an expected position as a result of field rate conversion through field repetition
  • Figs. 6A and 6B are comparative illustrations for spatial resolution enhancement.
  • System 100 includes a receiver 101, which in the exemplary embodiment is a high definition digital television (HDTV) large-screen or wide-screen television receiver.
  • receiver 101 may be an intermediate transceiver or any other device employed to receive or transceive video signals, as for example a transceiver retransmitting video information for reception by a high definition television.
  • receiver 101 includes a video enhancement mechanism as described in further detail below.
  • Receiver 101 includes an input 102 for receiving video signals and may optionally include an output 103 for transmitting enhanced video signals to another device.
  • Video signal processor 201 includes an enhancement vector estimator 202 and enhancement processor 203 which perform the video enhancement processing.
  • Video signal processor 201 in the exemplary embodiment is the device from which the enhanced video output is transmitted either to display 104 or to a storage medium (not shown).
  • Enhancement processor 203 performs the processing on received video signals required to enhance the video for display.
  • Image or video enhancement is a broad area which may be roughly divided into three categories: restoration of "lost" (image/video) information; elimination of artifacts; and enhancement of selected image/video characteristics.
  • Although the present invention is not limited to any particular category of video enhancement, for the purposes of simplicity, resolution enhancement, which falls within the third category, will be utilized to describe and explain the invention. Nonetheless, those skilled in the art will understand that the invention may be readily adapted or extended to video enhancements other than resolution enhancement and falling within any of the three categories listed.
  • Enhancement processor 203 together with enhancement vector estimator 202 in the exemplary embodiment, performs spatial resolution enhancement on the video information received.
  • the technique for estimation of enhancement vectors according to the present invention is similar to the recursive search block matching motion estimation process described in the references identified above.
  • the organization depicted is employed for block enhancement by video signal processor 201 depicted in Fig. 2.
  • the video information to be enhanced includes a plurality of successive pictures (which may be either fields or frames) to be displayed in sequence at a predefined rate.
  • "Successive," as used herein, refers to a subject picture being in consecutive series with another picture within the sequence, without regard to whether the subject picture is before or after the other picture within the video information.
  • a portion of the sequence of pictures, n-2, n-1, n, n+1 and n+2 is shown in Fig. 3.
  • Each picture comprises a two-dimensional array of pixels having coordinates (x,y) from the lower left corner of the picture.
  • Block enhancement units 206a-206n within video signal processor 201 enhance the received video information on a per block basis.
  • spatial resolution enhancement will be employed to explain the present invention. Specifically, an increase in the spatial resolution of the incoming video by a factor of two in both spatial dimensions of the fields will be employed to describe the present invention.
  • An initial estimate of higher spatial resolution video information G(x,n) based on the lower resolution video information F(x,n) may be initially created by a simple spatial up-conversion; that is, a sample-rate conversion interpolation filter within block enhancement units 206a-206n is employed to obtain a higher resolution image.
  • W_i() within the above equation indicates enhancement of the image data quality (where spatial resolution has already been enhanced by sample-rate conversion) by an algorithm i within a set of algorithms.
  • W_0(F(x,n)) could be the image data after frequency peaking while W_1(F(x,n)) may be the result after luminance transient improvement.
  • the penalty P_1 within the error function given above is a monotonic decreasing function of the norm of the enhancement vector V, introducing a large penalty for small coefficients and a small penalty for large coefficients.
  • the penalty P_2 is employed to bias the enhancement vector V towards a spatial-temporally consistent solution since this penalty depends on the selected enhancement vector candidate C. Accordingly, the value of penalty P_2 is selected from a predefined list of penalty values which are optimized for the application.
  • Each enhancement vector candidate C is preferably selected from enhancement vectors previously determined to produce the smallest error function values for blocks within a spatio-temporal neighborhood around the block B(X) being processed.
  • a "Y-prediction" estimator for recursive search block matching motion estimation in which spatial prediction vector candidates C_SP1 and C_SP2 are the vectors selected for blocks one block dimension above and to either side of and within the same field as the subject block B(X), while a temporal prediction candidate C_TP is the vector selected for a block two blocks directly below and within the previous field n-1 from the field n containing the subject block B(X).
  • Selection of candidate enhancement vectors from the enhancement vectors which produced optimal results within the spatio-temporal neighborhood of the subject block B(X) speeds the process of determining the best enhancement (the enhancement vector which produces the smallest error, or other suitable criteria for enhancement results, for the subject block B(X)) since it is very likely that enhancement(s) similar to those producing the best results for other blocks within the neighborhood of the subject block B(X) will produce the best results for the subject block B(X).
  • all possible candidate vectors of enhancement algorithms may be tested for each block.
  • the set of candidate vectors employed may change during processing of the video information, with, for example, all possible candidate vectors being tested for the first few fields of the video information and then a smaller subset of candidate vectors being employed for remaining fields, or with the selection of candidate vectors being otherwise refined as the video information is processed.
  • one candidate is always updated with a random update vector.
  • Several candidates may compete with each other, with the candidate yielding the smallest error ε(C,X,n) being selected as the enhancement vector for the data within the subject block B(X).
  • a block within a current field of the received video information is first selected (step 402) and a simple enhancement, in this case sample rate conversion, is performed.
  • the block is also enhanced utilizing each of a plurality of selected candidate enhancement vectors consisting of one or more enhancement algorithms employed jointly or individually, such as frequency peaking and luminance transient improvement.
  • An error function value where the error function includes a bias towards spatio-temporal consistency, is then computed for each candidate enhancement vector (step 404) and the enhancement corresponding to the candidate vector having the lowest error function value is selected (step 405) for display as part of the enhanced field.
  • a determination as to whether all blocks within the current field have been processed is then made, followed by selection and processing of a next block within the current field (step 407) if additional blocks remain and initiation of processing on the next field (step 408) if the current field has been completely processed. Once initiated, the process proceeds until interrupted by an external influence, such as the receiver being turned off or the reception of video information being interrupted.
  • machine usable mediums include: nonvolatile, hard-coded type mediums such as read only memories (ROMs) or erasable, electrically programmable read only memories (EEPROMs), recordable type mediums such as floppy disks, hard disk drives and compact disc read only memories (CD-ROMs) or digital versatile discs (DVDs), and transmission type mediums such as digital and analog communication links.
  • ROMs read only memories
  • EEPROMs electrically programmable read only memories
  • CD-ROMs compact disc read only memories
  • DVDs digital versatile discs
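
The per-block process summarized in the excerpts above (select a block, test candidate vectors including one randomly updated candidate, keep the candidate with the lowest error, then move to the next block and the next field) might be sketched as follows. This is a minimal illustration, not the patented implementation: the function `process_field`, the `toy_error` stand-in for the error function, the update step size, and the use of the previously processed block as the only spatial candidate are all assumptions made for the example.

```python
import random


def process_field(blocks, previous_vectors, error_fn, rng, step=0.25):
    """Choose one enhancement vector per block for a single field.

    blocks           -- block identifiers in processing order
    previous_vectors -- vectors chosen for the same blocks in the previous field
    error_fn         -- hypothetical error function error_fn(block, vector)
    """
    chosen = {}
    last = None
    for block in blocks:
        candidates = []
        if last is not None:
            candidates.append(last)                     # spatial candidate (assumed: previous block)
        if block in previous_vectors:
            candidates.append(previous_vectors[block])  # temporal candidate
        base = candidates[0] if candidates else (0.0, 0.0)
        # One candidate is always a randomly updated version of an existing one.
        candidates.append(tuple(v + rng.choice((-step, step)) for v in base))
        best = min(candidates, key=lambda c: error_fn(block, c))
        chosen[block] = best
        last = best
    return chosen


def toy_error(block, vector):
    """Stand-in error: distance of the vector from an arbitrary target (1, 0)."""
    return abs(vector[0] - 1.0) + abs(vector[1])


rng = random.Random(0)
vectors = {}
history = []
for field in range(4):
    vectors = process_field(range(6), vectors, toy_error, rng)
    history.append(sum(toy_error(b, v) for b, v in vectors.items()))
```

Because the vector chosen for a block in the previous field is always among the candidates, the total error in this toy setup cannot increase from one field to the next, which is the recursive behavior the candidate scheme relies on.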

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Television Systems (AREA)
  • Image Processing (AREA)
  • Picture Signal Circuits (AREA)

Abstract

A video signal processor enhancing video information evaluates candidate vectors of enhancement algorithms utilizing an error function biased towards spatio-temporal consistency with a penalty function. The penalty function increases with the distance--both spatial and temporal--of the subject block from the block for which the candidate vector was optimal. Enhancements are therefore gradual across both space and time and the enhanced video information is intrinsically free of perceptible spatio-temporal varying artifacts.

Description

3-D recursive vector estimation for video enhancement
The present invention is directed, in general, to video enhancement systems and, more specifically, to maintaining spatio-temporal consistency during video enhancement.
Many contemporary high performance televisions, particularly large screen and wide screen versions, utilize a spatial-temporal resolution which is higher than the normal resolution and refresh rate. For example, a 100 Hertz (Hz) screen refresh rate may be employed for the television display rather than the standard 50 or 60 Hertz. However, because the field rate—the number of interlaced screen images or "fields" for the television— within the program signal received will typically be only 50 fields per second, the number of fields for display must be doubled.
For digital televisions employing a field memory (a memory with the capacity to store a digitized version of a complete television field), one technique for doubling the field rate involves simply writing to the field memory at a first rate and reading from the field memory at a second rate which is double the first rate. However, such field rate up-conversion by simple field repetition results in each movement phase (i.e., frame) being displayed multiple times, with moving objects appearing slightly displaced from their expected spatio-temporal (space-time) position in the repeated movement phases as illustrated in Fig. 5.
The space-time positioning 501a, 501b and 501c of an object moving linearly across the screen within a sequence of three fields n-2, n-1 and n is shown in Fig. 5. Field rate up-conversion by field repetition produces intermediate fields (not labeled) in which the space-time positioning of the object is 503a, 503b and 503c rather than the expected space-time positioning of 502a, 502b, and 502c.
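The displacement described above can be reproduced numerically. The following Python sketch is only an illustration: it assumes an object moving at a constant, hypothetical speed of 8 pixels per input field, and the function name `object_positions` and its parameters are not taken from the patent.

```python
def object_positions(speed, n_fields, mode):
    """Object position in each output field after doubling the field rate.

    speed    -- displacement in pixels per input field period (assumed constant)
    n_fields -- number of input fields
    mode     -- "repeat" for field repetition, "ideal" for the expected motion
    """
    positions = []
    for k in range(2 * n_fields):
        if mode == "repeat":
            # A repeated field shows the object where the source field had it.
            positions.append(speed * (k // 2))
        else:
            # Expected position if the object kept moving between input fields.
            positions.append(speed * k / 2.0)
    return positions


shown = object_positions(8, 3, "repeat")     # [0, 0, 8, 8, 16, 16]
expected = object_positions(8, 3, "ideal")   # [0.0, 4.0, 8.0, 12.0, 16.0, 20.0]
displacement = [e - s for s, e in zip(shown, expected)]
print(displacement)                          # [0.0, 4.0, 0.0, 4.0, 0.0, 4.0]
```

Every second output field is off by half the per-field motion, which corresponds to the gap between the repeated positions and the expected positions in Fig. 5.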
While the displacement is almost unnoticeable to the human eye in video information captured at the normal field rates (50-60 Hz) employed by video cameras and the like, motion picture cameras have, for historical electro-mechanical reasons, operated at a capture rate of 24 frames per second. While modern motion picture cameras have been improved, much film exists which was recorded at that previously-standard capture rate. Such film is normally converted for television display by running the film at approximately 25 frames per second and then scanning each frame twice such that adjacent pairs of identical fields are created within the video information.
When up-converting a television formatted motion picture to a higher field rate utilizing simple field repetition, the already duplicated fields are again duplicated, creating sequences of four identical fields within the video information and resulting in a significant amount of motion jitter and picture blurring. To address these problems, motion compensation techniques such as three dimensional (3-D) recursive search block matching have been developed to provide motion-compensated interpolation. See, for example, G. de Haan, Motion Estimation and Compensation - An Integrated Approach to Consumer Display Field Rate Conversion (ISBN 90-74445-01-2) and G. de Haan et al., "True-Motion Estimation with 3-D Recursive Search Block Matching," IEEE Transactions on Circuits and Systems for Video Technology, 3(5):368-379 (October 1993).
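The arithmetic behind the "four identical fields" can be checked with a few lines of Python. This is a toy sketch in which film frames are represented by letters; the helper names are hypothetical.

```python
def scan_film_to_fields(frames):
    """Scan each film frame twice, producing pairs of identical fields."""
    return [frame for frame in frames for _ in range(2)]


def repeat_fields(fields):
    """Double the field rate by simple field repetition."""
    return [field for field in fields for _ in range(2)]


film = ["A", "B", "C"]
fields_50hz = scan_film_to_fields(film)
fields_100hz = repeat_fields(fields_50hz)
print(fields_100hz)
# ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C']
```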
High definition television (HDTV) often imposes a requirement differing from, and either in addition to or in lieu of, field rate up-conversion: image resolution enhancement. As illustrated in Figs. 6A and 6B, image resolution enhancement requires up-conversion from one resolution and the corresponding pixel size 601a and/or pixel density 602a to a higher resolution having a smaller pixel size 601b and/or greater pixel density 602b. Known interpolation techniques are employed to generate the additional pixels required from the original video information.
As known in the art, the shape or magnitude of edges within an image contributes significantly to the overall impression of "sharpness" for the image. Accordingly, various edge enhancement techniques such as frequency peaking and luminance transient improvement (LTI) have been developed for use during image resolution enhancement. Frequency peaking involves linear boosting or "peaking" of selected spatial frequencies within the image, often with a bandpass or highpass filter to enhance the associated spatial frequencies and with adaptive control to avoid "unnaturalness" relating to, for example, peaking large and steep edges. Unlike frequency peaking, luminance transient improvement preserves the magnitude of the edge but increases the steepness of the edge, "pulling" samples near the edge on both sides towards the edge. Existing edge enhancement algorithms enhance the sharpness of an image based on the spatial information of the original image, often utilizing control parameters determined by a small spatial neighborhood of a given pixel position. While these techniques are generally sufficient for still images, time varying conditions within video information such as (but not limited to) noise, motion, or lighting conditions, or even spatio-temporal varying conditions, may cause annoying artifacts in the processed video information.
Conservative tuning of the parameters may prevent such artifacts, but also constrains the enhancement.
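The difference between the two edge enhancement techniques can be seen on a one-dimensional edge. The sketch below is an assumption-laden illustration rather than either algorithm as used in any product: peaking is modeled as adding a scaled three-tap highpass response, and LTI is modeled as stronger peaking followed by clamping to the local minimum and maximum of the original signal, which is one common way to steepen an edge without changing its magnitude. All function names, gains, and the window radius are hypothetical.

```python
def highpass(signal):
    """Three-tap highpass: sample minus the mean of its two neighbors."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append(signal[i] - 0.5 * (left + right))
    return out


def frequency_peaking(signal, gain=1.0):
    """Linear boost of high spatial frequencies; may overshoot at edges."""
    return [s + gain * h for s, h in zip(signal, highpass(signal))]


def luminance_transient_improvement(signal, gain=2.0, radius=2):
    """Steepen edges, then clamp to the local range so edge magnitude is kept."""
    peaked = frequency_peaking(signal, gain)
    n = len(signal)
    out = []
    for i, p in enumerate(peaked):
        window = signal[max(i - radius, 0):min(i + radius + 1, n)]
        out.append(min(max(p, min(window)), max(window)))
    return out


edge = [0, 0, 0, 2, 6, 8, 8, 8]
peaked = frequency_peaking(edge)               # overshoots below 0 and above 8
lti = luminance_transient_improvement(edge)    # stays within 0..8, steeper edge
```

In this toy case the peaked signal leaves the original range on both sides of the edge, while the LTI-style output keeps the range and compresses the transition into fewer samples.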
There is, therefore, a need in the art for enhancement of video information with spatio-temporal consistency, or consistency of enhanced image data both with spatially surrounding (enhanced) image data in the field containing the enhanced image data and with counterpart or corresponding image data within subsequent fields.
To address the above-discussed deficiencies of the prior art, it is a primary object of the present invention to provide, for use in a video signal processor, a technique for enhancing video information which evaluates candidate vectors of enhancement algorithms utilizing an error function biased towards spatio-temporal consistency with a penalty function. The penalty function increases with the distance— both spatial and temporal— of the subject block from the block for which the candidate vector was optimal. Enhancements are therefore gradual across both space and time and the enhanced video information is intrinsically free of perceptible artifacts.
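One way to picture a penalty that grows with both spatial and temporal distance is sketched below. The patent text here states only that the penalty increases with distance; the linear form, the Manhattan block distance, and the two weights are assumptions chosen purely for illustration.

```python
def consistency_penalty(subject, source, spatial_weight=1.0, temporal_weight=2.0):
    """Penalty for reusing a vector found at `source` for the block at `subject`.

    Both arguments are (field, block_row, block_col) tuples. The weights are
    hypothetical tuning values, not values from the patent.
    """
    field_a, row_a, col_a = subject
    field_b, row_b, col_b = source
    spatial_distance = abs(row_a - row_b) + abs(col_a - col_b)
    temporal_distance = abs(field_a - field_b)
    return spatial_weight * spatial_distance + temporal_weight * temporal_distance


here = (5, 3, 3)
print(consistency_penalty(here, (5, 3, 3)))   # 0.0: same block, same field
print(consistency_penalty(here, (5, 2, 4)))   # 2.0: diagonal neighbor, same field
print(consistency_penalty(here, (4, 5, 3)))   # 4.0: two blocks away, previous field
```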
The foregoing has outlined rather broadly the features and technical advantages of the present invention so that those skilled in the art may better understand the detailed description of the invention that follows. Additional features and advantages of the invention will be described hereinafter that form the subject of the claims of the invention. Those skilled in the art will appreciate that they may readily use the conception and the specific embodiment disclosed as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. Those skilled in the art will also realize that such equivalent constructions do not depart from the spirit and scope of the invention in its broadest form. Before undertaking the detailed description of the invention below, it may be advantageous to set forth definitions of certain words or phrases used throughout this patent document: the terms "include" and "comprise," as well as derivatives thereof, mean inclusion without limitation; the term "or" is inclusive, meaning and/or; the phrases "associated with" and "associated therewith," as well as derivatives thereof, may mean to include, be included within, interconnect with, contain, be contained within, connect to or with, couple to or with, be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or with, have, have a property of, or the like; and the term "controller" means any device, system or part thereof that controls at least one operation, whether such a device is implemented in hardware, firmware, software or some combination of at least two of the same. It should be noted that the functionality associated with any particular controller may be centralized or distributed, whether locally or remotely. 
Definitions for certain words and phrases are provided throughout this patent document, and those of ordinary skill in the art will understand that such definitions apply in many, if not most, instances to prior as well as future uses of such defined words and phrases.
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, wherein like numbers designate like objects, and in which:
Fig. 1 depicts a system in which video enhancement with spatio-temporal consistency is implemented according to one embodiment of the present invention;
Fig. 2 illustrates in greater detail a system for video enhancement with spatio-temporal consistency according to one embodiment of the present invention;
Fig. 3 illustrates a logical organization of video information for video enhancement with spatio-temporal consistency according to one embodiment of the present invention;
Fig. 4 is a high level flow chart for a process of video enhancement with spatio-temporal consistency according to one embodiment of the present invention;
Fig. 5 is an illustration of displacement of a moving object from an expected position as a result of field rate conversion through field repetition; and
Figs. 6A and 6B are comparative illustrations for spatial resolution enhancement.
Figs. 1 through 4, discussed below, and the various embodiments used to describe the principles of the present invention in this patent document are by way of illustration only and should not be construed in any way to limit the scope of the invention. Those skilled in the art will understand that the principles of the present invention may be implemented in any suitably arranged device.
Fig. 1 depicts a system in which video enhancement with spatio-temporal consistency is implemented according to one embodiment of the present invention. System 100 includes a receiver 101, which in the exemplary embodiment is a high definition digital television (HDTV) large-screen or wide-screen television receiver. Alternatively, however, receiver 101 may be an intermediate transceiver or any other device employed to receive or transceive video signals, as for example a transceiver retransmitting video information for reception by a high definition television. In any embodiment, receiver 101 includes a video enhancement mechanism as described in further detail below. Receiver 101 includes an input 102 for receiving video signals and may optionally include an output 103 for transmitting enhanced video signals to another device. In the exemplary embodiment, receiver 101 includes a high definition television display 104 upon which images rendered or otherwise generated according to the enhanced video information are displayed. Those skilled in the art will perceive that Fig. 1 does not explicitly depict all components within the high definition television receiver of the exemplary embodiment. Only so much of the commonly known construction and operation of a high definition television receiver and the components therein as is unique to the present invention and/or required for an understanding of the present invention is shown and described herein.
Fig. 2 illustrates in greater detail a system for video enhancement with spatio-temporal consistency according to one embodiment of the present invention. Receiver 101 includes a video signal processor 201, which may be implemented by a single integrated circuit device or a combination of integrated circuit devices. Video signal processor 201 includes an enhancement vector estimator 202 and enhancement processor 203 which perform the video enhancement processing. Video signal processor 201 in the exemplary embodiment is the device from which the enhanced video output is transmitted either to display 104 or to a storage medium (not shown).
Enhancement processor 203 performs the processing on received video signals required to enhance the video for display. Image or video enhancement is a broad area which may be roughly divided into three categories: restoration of "lost" (image/video) information; elimination of artifacts; and enhancement of selected image/video characteristics. Although the present invention is not limited to any particular category of video enhancement, for the purposes of simplicity, resolution enhancement, which falls within the third category, will be utilized to describe and explain the invention. Nonetheless, those skilled in the art will understand that the invention may be readily adapted or extended to video enhancements other than resolution enhancement and falling within any of the three categories listed.
Enhancement processor 203, together with enhancement vector estimator 202 in the exemplary embodiment, performs spatial resolution enhancement on the video information received. The technique for estimation of enhancement vectors according to the present invention is similar to the recursive search block matching motion estimation process described in the references identified above.
To perform video enhancement, enhancement vector estimator 202 includes one or more caches 205a-205n for temporary storage of pixel information relating to processing of a block of pixels, one or more block enhancement units 206a-206n, an enhancement vector memory 207, and a best enhancement selection unit 208 which identifies and selects the best enhancement on a per block basis as described in further detail below.
Fig. 3 illustrates a logical organization of video information for video enhancement with spatio-temporal consistency according to one embodiment of the present invention. The organization depicted is employed for block enhancement by video signal processor 201 depicted in Fig. 2. The video information to be enhanced includes a plurality of successive pictures (which may be either fields or frames) to be displayed in sequence at a predefined rate. "Successive," as used herein, refers to a subject picture being in consecutive series with another picture within the sequence, without regard to whether the subject picture is before or after the other picture within the video information. A portion of the sequence of pictures, n-2, n-1, n, n+1 and n+2, is shown in Fig. 3. Each picture comprises a two-dimensional array of pixels having coordinates (x,y) from the lower left corner of the picture, where the array function F(x,n) represents the pixel value at pixel position x, with coordinates (x,y), and field number n within the video information at an initial (lower) spatial resolution. Each picture is logically divided into an array of blocks of pixels B(X) of a predetermined number of pixels in width and height and having a center X. The blocks or pixel regions may be rectangular as depicted or may be any other shape.
Block enhancement units 206a-206n within video signal processor 201 enhance the received video information on a per block basis. As noted above, spatial resolution enhancement will be employed to explain the present invention. Specifically, an increase in the spatial resolution of the incoming video by a factor of two in both spatial dimensions of the fields will be employed to describe the present invention.
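A factor-of-two increase in both spatial dimensions can be sketched with a very simple interpolation filter. The example below uses linear interpolation between neighboring samples as a stand-in for the sample-rate conversion filter; the filter choice, the edge handling, and the function names are assumptions for illustration only.

```python
def upsample_1d(row):
    """Double the sample rate of a row: keep each sample and insert the
    average of it and its right-hand neighbor (last sample is repeated)."""
    out = []
    n = len(row)
    for i, value in enumerate(row):
        right = row[min(i + 1, n - 1)]
        out.append(value)
        out.append(0.5 * (value + right))
    return out


def upsample_2d(image):
    """Double the resolution in both dimensions, rows first, then columns."""
    wide = [upsample_1d(row) for row in image]
    columns = [upsample_1d(list(column)) for column in zip(*wide)]
    return [list(row) for row in zip(*columns)]


F = [[0, 4],
     [8, 12]]
G = upsample_2d(F)
for row in G:
    print(row)
```

The original samples survive at the even positions of the output, so a down-conversion that keeps every second sample in each dimension recovers the input exactly for this particular filter.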
An initial estimate of higher spatial resolution video information G(x,n) based on the lower resolution video information F(x,n) may be initially created by a simple spatial up-conversion; that is, a sample-rate conversion interpolation filter within block enhancement units 206a-206n is employed to obtain a higher resolution image. The down-conversion operation T(), which defines down-conversion of the high resolution video information G(x,n) to low resolution video information F(x,n), given by F(x,n) = T(G(x,n)), is employed in an error criterion for selecting the best enhancement of a given block B(X). The error criterion, a measure for performance of the enhanced video information G(x,n), is based on differences between the initial low resolution video information F(x,n) and the low resolution video information T(G(x,n)) obtained by down-converting the high resolution video information G(x,n), and is given by:

$$\varepsilon(C, X, n) = \sum_{x \in B(X)} \left| F(x,n) - T(G(x,n)) \right| + P_1(\lVert V \rVert) + P_2(C)$$

where C is a candidate for the enhancement vector V = (v_0, v_1, ..., v_m) consisting of coefficients which are utilized to create video information G(x,n) according to:

$$G(x,n) = \sum_{i=0}^{m} v_i \, W_i(F(x,n)).$$
Wi() within the above equation indicates enhancement of the image data quality (where spatial resolution has already been enhanced by sample-rate conversion) by an algorithm i within a set of algorithms. For example, W0(F(x,n)) could be the image data after frequency peaking while W1(F(x,n)) may be the result after luminance transient improvement. The penalty P1 within the error function given above is a monotonically decreasing function of the norm of the enhancement vector V, introducing a large penalty for small coefficients and a small penalty for large coefficients. The penalty P2 is employed to bias the enhancement vector V towards a spatio-temporally consistent solution since this penalty depends on the selected enhancement vector candidate C. Accordingly, the value of penalty P2 is selected from a predefined list of penalty values which are optimized for the application.
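The weighted combination of enhancement algorithm outputs and the penalized error described above may be sketched as follows. This is an illustration under stated assumptions, not the embodiment itself: the difference measure is taken to be a sum of absolute differences, the down-conversion T() is modeled as a 2 x 2 average (matching the factor-of-two example), and the form of P1 is merely one possible monotonically decreasing function of the vector norm. All function names are hypothetical.

```python
import math


def combine(layers, coeffs):
    """G = sum over i of v_i * W_i(F): weighted sum of equally sized 2-D images."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [[sum(v * layer[r][c] for v, layer in zip(coeffs, layers))
             for c in range(cols)]
            for r in range(rows)]


def down_convert(high):
    """Assumed T(): average each 2x2 neighbourhood (dimensions must be even)."""
    return [[(high[r][c] + high[r][c + 1] + high[r + 1][c] + high[r + 1][c + 1]) / 4
             for c in range(0, len(high[0]), 2)]
            for r in range(0, len(high), 2)]


def penalty_p1(coeffs, scale=1.0):
    """Illustrative P1: large for small coefficients, small for large ones."""
    return scale / (1.0 + math.sqrt(sum(v * v for v in coeffs)))


def block_error(low, layers, coeffs, p2):
    """Error for one candidate: mismatch after down-conversion plus P1 and P2.

    low    -- low resolution block F
    layers -- outputs W_i(F) of each enhancement algorithm at high resolution
    coeffs -- candidate enhancement vector C
    p2     -- consistency penalty looked up for this candidate (assumed given)
    """
    down = down_convert(combine(layers, coeffs))
    mismatch = sum(abs(f - d)
                   for f_row, d_row in zip(low, down)
                   for f, d in zip(f_row, d_row))
    return mismatch + penalty_p1(coeffs) + p2
```

With a one-pixel low resolution block of value 1.0 and two constant layers of values 1 and 2, the candidate (1.0, 0.0) reproduces the input exactly after down-conversion, so its error consists of the two penalties only.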
Each enhancement vector candidate C is preferably selected from enhancement vectors previously determined to produce the smallest error function values for blocks within a spatio-temporal neighborhood around the block B(X) being processed. For example, one reference identified above suggests a "Y-prediction" estimator for recursive search block matching motion estimation, in which spatial prediction vector candidates CSP1 and CSP2 are the vectors selected for blocks one block dimension above and to either side of and within the same field as the subject block B(X), while a temporal prediction candidate CTP is the vector selected for a block two blocks directly below and within the previous field n-1 from the field n containing the subject block B(X). Selection of candidate enhancement vectors from the enhancement vectors which produced optimal results within the spatio-temporal neighborhood of the subject block B(X) speeds the process of determining the best enhancement (the enhancement vector which produces the smallest error, or satisfies other suitable criteria for enhancement results, for the subject block B(X)), since it is very likely that enhancements similar to those producing the best results for other blocks within the neighborhood of the subject block B(X) will produce the best results for the subject block B(X).
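Gathering candidates from the spatio-temporal neighborhood in the manner of the "Y-prediction" example may be sketched as follows. The dictionaries keyed by block index, the default vector used where no neighbor has been processed, and the removal of duplicate candidates are assumptions made for this illustration; block rows are counted upward from the bottom of the picture, consistent with the lower left origin.

```python
def candidate_vectors(bx, by, current, previous, default):
    """Collect candidate enhancement vectors for the block at index (bx, by).

    current  -- best vectors already selected in the current field n,
                keyed by block index (hypothetical storage layout)
    previous -- best vectors selected in the previous field n-1
    default  -- vector used when a neighbour has no stored result (assumption)
    """
    lookups = [
        (current, (bx - 1, by + 1)),   # spatial: one block above, to the left
        (current, (bx + 1, by + 1)),   # spatial: one block above, to the right
        (previous, (bx, by - 2)),      # temporal: two blocks below, previous field
    ]
    candidates = []
    for store, key in lookups:
        vector = store.get(key, default)
        if vector not in candidates:   # avoid evaluating the same vector twice
            candidates.append(vector)
    return candidates
```

Only three vectors are evaluated per block in this arrangement, rather than every possible combination of coefficients.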
Alternatively, all possible candidate vectors of enhancement algorithms may be tested for each block. Moreover, the set of candidate vectors employed may change during processing of the video information, with, for example, all possible candidate vectors being tested for the first few fields of the video information and then a smaller subset of candidate vectors being employed for remaining fields, or with the selection of candidate vectors being otherwise refined as the video information is processed. Preferably one candidate is always updated with a random update vector. Several candidates may compete with each other, with the candidate yielding the smallest error ε(C,X,n) being selected as the enhancement vector for the data within the subject block B(X).
As a result of the present invention, an enhancement vector which may be utilized with near-optimal results for spatial resolution up-conversion of a particular block is selected on a per-block basis. Spatio-temporal consistency is automatically achieved. Block erosion similar to, but not restricted to, the process disclosed in the references identified above may be employed to prevent blocking artifacts.
Fig. 4 is a high level flow chart for a process of video enhancement with spatio-temporal consistency according to one embodiment of the present invention. The process 400, performed by the video signal processor 201 depicted in Fig. 2 utilizing the logical organization of video information illustrated in Fig. 3, begins with receipt (step 401) of video information for enhancement. As noted above, the process may be performed for various types of enhancements but spatial resolution enhancement will be employed to describe the process. A block within a current field of the received video information is first selected (step 402) and a simple enhancement, in this case sample rate conversion, is performed. The block is also enhanced utilizing each of a plurality of selected candidate enhancement vectors consisting of one or more enhancement algorithms employed jointly or individually, such as frequency peaking and luminance transient improvement.
An error function value, where the error function includes a bias towards spatio-temporal consistency, is then computed for each candidate enhancement vector (step 404) and the enhancement corresponding to the candidate vector having the lowest error function value is selected (step 405) for display as part of the enhanced field. A determination as to whether all blocks within the current field have been processed (step 406) is then made, followed by selection and processing of a next block within the current field (step 407) if additional blocks remain and initiation of processing on the next field (step 408) if the current field has been completely processed. Once initiated, the process proceeds until interrupted by an external influence, such as the receiver being turned off or the reception of video information being interrupted.
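The per-block selection loop of Fig. 4 may be sketched as follows. The callables candidates_for, enhance and error stand in for the candidate selection, enhancement and error computation described above; their names and signatures are hypothetical, and at least one candidate per block is assumed.

```python
def enhance_field(blocks, candidates_for, enhance, error):
    """Select, for every block of one field, the candidate with the lowest error.

    blocks         -- iterable of hashable block identifiers for the current field
    candidates_for -- callable(block, selected) returning candidate vectors
    enhance        -- callable(block, vector) returning the enhanced block
    error          -- callable(block, enhanced, vector) returning the biased error
    Returns a dict mapping each block to (chosen vector, enhanced block).
    """
    selected = {}
    for block in blocks:                          # steps 402, 406 and 407
        best = None
        for vector in candidates_for(block, selected):
            enhanced = enhance(block, vector)     # enhancement per candidate
            err = error(block, enhanced, vector)  # step 404
            if best is None or err < best[0]:
                best = (err, vector, enhanced)
        selected[block] = (best[1], best[2])      # step 405
    return selected
```

Passing the results already selected for the field to candidates_for allows spatial neighbors processed earlier to supply candidates for later blocks. The function would be called once per field (step 408), for example in a loop over the incoming video information.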
The present invention allows enhancements to video information (other than position within a repeated field) to be processed in a manner inherently producing spatio-temporally consistent results. The error function employed to select the best enhancement vector of enhancement algorithms is biased towards spatio-temporal consistency by addition of a penalty that increases as candidate vectors differ from a block being enhanced in space, time, or both. As a result, the selected enhancement produces changes which are gradual over space and time and inherently free of spatio-temporally varying artifacts.
It is important to note that while the present invention has been described in the context of a fully functional hardware based system and/or network, those skilled in the art will appreciate that the mechanism of the present invention is capable of being distributed in the form of a machine usable medium containing instructions in a variety of forms, and that the present invention applies equally regardless of the particular type of signal bearing medium utilized to actually carry out the distribution. Examples of machine usable mediums include: nonvolatile, hard-coded type mediums such as read only memories (ROMs) or erasable, electrically programmable read only memories (EEPROMs), recordable type mediums such as floppy disks, hard disk drives and compact disc read only memories (CD-ROMs) or digital versatile discs (DVDs), and transmission type mediums such as digital and analog communication links. Although the present invention has been described in detail, those skilled in the art will understand that various changes, substitutions and alterations herein may be made without departing from the spirit and scope of the invention in its broadest form.

Claims

1. For use in a receiver 101, a video enhancement mechanism 201 for enhancing video information with spatio-temporal consistency comprising:
- at least one enhancement unit 206a enhancing a characteristic other than position of a selected pixel region of video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate an enhanced pixel region for each candidate enhancement vector, each said enhanced pixel region equivalent to enhancement of said selected pixel region utilizing a respective candidate enhancement vector of enhancement algorithms; and
- a selection unit 208 computing an error for each said enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing said selected pixel region and with a counterpart pixel region in one or more pictures successive with said picture containing said selected pixel region, said selection unit selecting an enhanced pixel region having a best enhancement for spatio-temporal consistency.
2. The video enhancement mechanism 201 as set forth in claim 1 wherein said at least one candidate enhancement vector is selected from enhancement vectors determined to produce a best enhancement for spatio-temporal consistency in enhancing pixel regions within a spatial and temporal neighborhood of said selected pixel region.
3. The video enhancement mechanism 201 as set forth in claim 1 wherein said bias towards spatio-temporal consistency further comprises first and second penalties, said first penalty varying based upon coefficients for each candidate enhancement vector and said second penalty varying for each candidate enhancement vector.
4. The video enhancement mechanism 201 as set forth in claim 3 wherein said error is computed on a per-pixel region basis for each pixel region within said video information and for each candidate enhancement vector for a respective pixel region.
5. A high definition television receiver 101 comprising:
- an input connection 102 receiving video information;
- a display 104 on which enhanced images derived from said video information are displayed; and
- a video enhancement mechanism 201 for enhancing said video information with spatio-temporal consistency comprising:
- at least one enhancement unit 206a enhancing a characteristic other than position of a selected pixel region of video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate an enhanced pixel region for each candidate enhancement vector, each said enhanced pixel region equivalent to enhancement of said selected pixel region utilizing a respective candidate enhancement vector of enhancement algorithms; and
- a selection unit 208 computing an error for each said enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing said selected pixel region and with a counterpart pixel region in one or more pictures successive with said picture containing said selected pixel region, said selection unit selecting an enhanced pixel region having a best enhancement for spatio-temporal consistency.
6. The receiver 101 as set forth in claim 5 wherein said at least one candidate enhancement vector of enhancement algorithms is selected from enhancement vectors determined to produce a best enhancement for spatio-temporal consistency in enhancing pixel regions within a spatial and temporal neighborhood of said selected pixel region.
7. The receiver 101 as set forth in claim 5 wherein said bias towards spatio-temporal consistency further comprises first and second penalties, said first penalty varying based upon coefficients for each candidate enhancement vector and said second penalty varying for each candidate enhancement vector.
8. The receiver 101 as set forth in claim 6 wherein said error is computed on a per-pixel region basis for each pixel region within said video information and for each candidate enhancement vector for a respective pixel region.
9. For use in a receiver 101, a method of enhancing video information with spatio-temporal consistency comprising:
- enhancing a characteristic other than position of a selected pixel region of video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate an enhanced pixel region for each candidate enhancement vector, each enhanced pixel region equivalent to enhancement of the selected pixel region utilizing a respective candidate enhancement vector of enhancement algorithms;
- computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected pixel region and with a counterpart pixel region in one or more pictures successive with the picture containing the selected pixel region; and
- selecting an enhanced pixel region having a best enhancement for spatio-temporal consistency.
10. The method as set forth in claim 9 wherein the step of enhancing a characteristic other than position of a selected pixel region of video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate an enhanced pixel region for each candidate enhancement vector further comprises selecting the at least one candidate enhancement vector of enhancement algorithms from enhancement vectors determined to produce a best enhancement for spatio-temporal consistency in enhancing pixel regions within a spatial and temporal neighborhood of the selected pixel region.
11. The method as set forth in claim 9 wherein the step of computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected pixel region and with a counterpart pixel region in one or more pictures successive with the picture containing the selected pixel region further comprises adding first and second penalties to the error as the bias, the first penalty varying based upon coefficients for each candidate enhancement vector and the second penalty varying for each candidate enhancement vector.
12. The method as set forth in claim 11 wherein the step of computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected pixel region and with a counterpart pixel region in one or more pictures successive with the picture containing the selected pixel region further comprises computing the error on a per-pixel region basis for each pixel region within the video information and for each candidate enhancement vector for a respective pixel region.
13. A computer program product within a computer usable medium for enhancing video information with spatio-temporal consistency comprising:
- instructions for enhancing a characteristic other than position of a selected pixel region of video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate an enhanced pixel region for each candidate enhancement vector, each enhanced pixel region equivalent to enhancement of the selected pixel region utilizing a respective candidate enhancement vector of enhancement algorithms;
- instructions for computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected pixel region and with a counterpart pixel region in one or more pictures successive with the picture containing the selected pixel region; and
- instructions for selecting an enhanced pixel region having a best enhancement for spatio-temporal consistency.
14. The computer program product as set forth in claim 13 wherein the instructions for enhancing a characteristic other than position of a selected pixel region of video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate an enhanced pixel region for each candidate enhancement vector further comprise instructions for selecting the at least one candidate enhancement vector of enhancement algorithms from enhancement vectors determined to produce a best enhancement for spatio-temporal consistency in enhancing pixel regions within a spatial and temporal neighborhood of the selected pixel region.
15. The computer program product as set forth in claim 14 wherein the instructions for computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected pixel region and with a counterpart pixel region in one or more pictures successive with the picture containing the selected pixel region further comprise instructions for adding first and second penalties to the error as the bias, the first penalty varying based upon coefficients for each candidate enhancement vector and the second penalty varying for each candidate enhancement vector.
16. The computer program product as set forth in claim 15 wherein the instructions for computing an error for each enhanced pixel region utilizing a bias towards spatio-temporal consistency of a respective enhanced pixel region with spatially adjacent pixel regions in a picture containing the selected pixel region and with a counterpart pixel region in one or more pictures successive with the picture containing the selected pixel region further comprise instructions for computing the error on a per-pixel region basis for each pixel region within the video information and for each candidate enhancement vector for a respective pixel region.
17. A video information signal comprising:
- a data stream containing one or more pictures; and
- at least one enhanced pixel region within at least one of said pictures, each enhanced pixel region derived from received video information by enhancing a characteristic other than position of a selected pixel region of said received video information utilizing at least one candidate enhancement vector of enhancement algorithms to generate a candidate enhanced pixel region for each candidate enhancement vector, each candidate enhanced pixel region equivalent to enhancement of said selected pixel region utilizing a respective candidate enhancement vector of enhancement algorithms,
- wherein each enhanced pixel region within a respective picture has a best enhancement for spatio-temporal consistency among said candidate enhanced pixel regions for an error utilizing a bias towards spatio-temporal consistency of said respective enhanced pixel region with spatially adjacent pixel regions in a picture containing said selected pixel region and with a counterpart pixel region in one or more pictures successive with said picture containing said selected pixel region.
18. The video information signal as set forth in claim 17 wherein said at least one candidate enhancement vector is selected from enhancement vectors determined to produce a smallest computed error value in enhancing pixel regions within a spatial and temporal neighborhood of said selected pixel region.
19. The video information signal as set forth in claim 17 wherein said bias towards spatio-temporal consistency comprises first and second penalties, said first penalty varying based upon coefficients for each candidate enhancement vector and said second penalty varying for each candidate enhancement vector.
20. The video information signal as set forth in claim 19 wherein each said enhanced pixel region within any picture is selected utilizing said error computed on a per-pixel region basis for each pixel region within said received video information and for each candidate enhancement vector for a respective pixel region.
PCT/IB2002/000275 2001-02-08 2002-01-28 3-d recursive vector estimation for video enhancement WO2002063562A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP02716230A EP1360648A2 (en) 2001-02-08 2002-01-28 Spatio-temporal video enhancement
KR1020027013488A KR20020087128A (en) 2001-02-08 2002-01-28 3-D Recursive vector estimation for video enhancement
JP2002563430A JP2004519145A (en) 2001-02-08 2002-01-28 Evaluation of 3D Inductive Vectors for Video Enhancement

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US26725601P 2001-02-08 2001-02-08
US60/267,256 2001-02-08
US09/840,817 2001-04-24
US09/840,817 US7042945B2 (en) 2001-04-24 2001-04-24 3-D recursive vector estimation for video enhancement

Publications (2)

Publication Number Publication Date
WO2002063562A2 true WO2002063562A2 (en) 2002-08-15
WO2002063562A3 WO2002063562A3 (en) 2002-10-17

Family

ID=26952332

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2002/000275 WO2002063562A2 (en) 2001-02-08 2002-01-28 3-d recursive vector estimation for video enhancement

Country Status (5)

Country Link
EP (1) EP1360648A2 (en)
JP (1) JP2004519145A (en)
KR (1) KR20020087128A (en)
CN (1) CN1460230A (en)
WO (1) WO2002063562A2 (en)

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
AUFRICHTIG R ET AL: "Spatio-temporal X-ray fluoroscopy filtering using object detection" PROCEEDINGS OF THE COMPUTERS IN CARDIOLOGY CONFERENCE. LONDON, SEPT. 5 - 8, 1993, LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, 5 September 1993 (1993-09-05), pages 587-590, XP010128831 ISBN: 0-8186-5470-8 *
BORMAN S ET AL: "Simultaneous multi-frame MAP super-resolution video enhancement using spatio-temporal priors" IMAGE PROCESSING, 1999. ICIP 99. PROCEEDINGS. 1999 INTERNATIONAL CONFERENCE ON KOBE, JAPAN 24-28 OCT. 1999, PISCATAWAY, NJ, USA,IEEE, US, 24 October 1999 (1999-10-24), pages 469-473, XP010368835 ISBN: 0-7803-5467-2 *
BRAILEAN J C ET AL: "SIMULTANEOUS RECURSIVE DISPLACEMENT ESTIMATION AND RESTORATION OF NOISY-BLURRED IMAGE SEQUENCES" IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE INC. NEW YORK, US, vol. 4, no. 9, 1 September 1995 (1995-09-01), pages 1236-1251, XP000533956 ISSN: 1057-7149 *
HAAN DE G ET AL: "TRUE-MOTION ESTIMATION WITH 3-D RECURSIVE SEARCH BLOCK MATCHING" IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE INC. NEW YORK, US, vol. 3, no. 5, 1 October 1993 (1993-10-01), pages 368-379, XP000414663 ISSN: 1051-8215 cited in the application *
KLEIHORST R P ET AL: "NOISE REDUCTION OF IMAGE SEQUENCES USING MOTION COMPENSATION AND SIGNAL DECOMPOSITION" IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE INC. NEW YORK, US, vol. 4, no. 3, 1 March 1995 (1995-03-01), pages 274-284, XP000501902 ISSN: 1057-7149 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004073313A1 (en) * 2003-02-13 2004-08-26 Koninklijke Philips Electronics N.V. Spatio-temporal up-conversion
WO2007089803A2 (en) * 2006-01-31 2007-08-09 Thomson Licensing Methods and apparatus for edge-based spatio-temporal filtering
WO2007089803A3 (en) * 2006-01-31 2007-10-11 Thomson Licensing Methods and apparatus for edge-based spatio-temporal filtering
US8135234B2 (en) 2006-01-31 2012-03-13 Thomson Licensing Method and apparatus for edge-based spatio-temporal filtering

Also Published As

Publication number Publication date
WO2002063562A3 (en) 2002-10-17
CN1460230A (en) 2003-12-03
KR20020087128A (en) 2002-11-21
EP1360648A2 (en) 2003-11-12
JP2004519145A (en) 2004-06-24

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): CN JP KR

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

WWE Wipo information: entry into national phase

Ref document number: 2002716230

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 1020027013488

Country of ref document: KR

121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): CN JP KR

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR

WWP Wipo information: published in national office

Ref document number: 1020027013488

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 028010655

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2002563430

Country of ref document: JP

WWP Wipo information: published in national office

Ref document number: 2002716230

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 2002716230

Country of ref document: EP