CN106295587B

CN106295587B - Method for fast calibration of a video region of interest

Info

Publication number: CN106295587B
Application number: CN201610675724.5A
Authority: CN
Inventors: 李永溢; 徐家骏; 江周平
Original assignee: Interactive (beijing) Technology Co Ltd
Current assignee: Beijing Anxin Zhitong Technology Co.,Ltd.
Priority date: 2016-08-16
Filing date: 2016-08-16
Publication date: 2019-05-31
Anticipated expiration: 2036-08-16
Also published as: CN106295587A

Abstract

The invention discloses a method and device for fast calibration of a video region of interest. The method comprises: determining a target frame image in video data; determining first component image data and second component image data in the target frame image; calculating a threshold TH of the first component image data, and calculating a difference threshold DTH between the first component image data and X times the second component image data; and determining, according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macroblock of the target frame image, the macroblocks located in a face or skin region, and taking the set of these macroblocks as the region of interest of the target frame image. The invention provides an accurate, simple and fast method for calibrating a video region of interest. It calibrates face regions in dynamic video images and is particularly suitable for terminals and usage scenarios with limited processor capability.

Description

Method for fast calibration of a video region of interest

Technical field

The present invention relates to a video data processing method, and in particular to a method and device for fast calibration of a video region of interest.

Background technique

With the continuous improvement of Internet and mobile Internet infrastructure and the successive upgrades of cellular mobile communication networks, people increasingly use video telephony or video calls as a means of remote communication in life and work. However, because of factors such as insufficient network bandwidth, excessive network transmission load and ever-increasing video image resolution, traditional video coding techniques cannot guarantee that the user obtains high-quality video images. After many years of development, video coding technology is gradually approaching the limit of compression ratio. How to obtain video images of higher subjective quality for the viewer at a relatively stable compression ratio is one of the difficulties in the current field of video compression.

A color space, also called a color model or color system, serves to describe color in a generally accepted way under certain standards. There are many color spaces; common ones are RGB, YUV and HSI. RGB (red, green, blue) is a space defined according to the colors recognized by the human eye and can represent most colors. However, because the RGB model expresses hue, luminance and saturation together, the three quantities are hard to separate and detailed digital adjustment is difficult. In engineering and scientific research the RGB model is therefore generally not used directly; instead, the RGB model is converted to another color model such as YUV or HSI before subsequent processing.

The YUV color model obtains a luminance signal Y and two color-difference signals, B-Y (i.e. U) and R-Y (i.e. V), from the R, G and B components of the RGB model by a matrix transformation. The transmitting end then encodes the luminance signal and the two color-difference signals separately and sends them over the same channel. "Y" denotes luminance (luma), that is, the grayscale value. "U" and "V" denote chrominance (chroma), which describes the color and saturation of the image and specifies the color of a pixel; the chrominance is represented by Cr and Cb. Cr reflects the difference between the red part of the RGB input signal and the luminance of the RGB signal, and Cb reflects the difference between the blue part of the RGB input signal and the luminance of the RGB signal. Because the luminance signal Y of the YUV color model is separated from the chrominance signals U and V, the model is well suited to image processing such as enhancement and compression.

The HSI (Hue-Saturation-Intensity (Lightness), HSI or HSL) color model describes color characteristics with three parameters H, S and I. H defines the wavelength of the color and is called hue; S indicates the depth of the color and is called saturation; I indicates intensity or brightness. When a person observes a colored object, the color is described in terms of hue, saturation and brightness. Hue is the attribute that describes a pure color (pure yellow, orange or red). Saturation measures the degree to which a pure color is diluted by white light. Brightness is a subjective descriptor that in practice cannot be measured; it embodies the achromatic notion of intensity and is a key parameter in describing color sensation. Intensity (gray level), in contrast, is the most useful descriptor of a monochrome image; it is measurable and easy to interpret. The model built on these quantities is called the HSI (hue, saturation, intensity) color model. It removes the influence of the intensity component from the color information (hue and saturation) carried in a color image, which makes the HSI model a good tool for developing image processing methods based on color descriptions that are natural and intuitive for people.

The HSI model was proposed in 1915 by the American color scientist Munsell (A. H. Munsell). It reflects the way the human visual system perceives color, namely through the three basic characteristics of hue, saturation and intensity.

Region of interest (ROI) video coding is currently one of the effective solutions to the above problem. The human eye has regions of visual interest: it is interested to different degrees in different regions of an image, which means that it perceives quality loss in different regions of an image to different degrees. A video coding method based on the region of interest can therefore make full use of this subjective characteristic of the viewer and apply different coding strategies to different regions of the image, which can noticeably improve the subjective quality of experience without increasing the coding bit rate. To support region of interest video coding, the current mainstream video codec standards provide corresponding suggested implementations. For example, the Moving Picture Experts Group 2 (MPEG-2) standard of the International Organization for Standardization / International Electrotechnical Commission (ISO/IEC), the ISO/IEC MPEG-4 Part 10 Advanced Video Coding (AVC) standard, and the International Telecommunication Union (ITU) H.264 standard each provide mechanisms by which a specific region of an image obtains higher quality than other regions. These mechanisms allow the transmitting end to identify the region of interest by analyzing the characteristics of the image to be encoded and to allocate more coding bits to the region of interest in order to retain more detail and thus obtain higher perceived quality.

For the general ROI extraction problem, the popular approach is to build a computable visual attention model. This is a complex model involving visual physiology, cognitive psychology, memory mechanisms, image information and so on, and it is difficult to derive a fast calibration method from it, especially for dynamic images. Other methods that specifically calibrate face regions in dynamic images focus too heavily on the shape of the face or on the properties of particular parts of the face, and therefore often introduce complex algorithms that are likewise unsuitable for fast calibration of face regions in dynamic images. A software encoder conforming to the H.264 specification, such as the OpenH264 encoder, already occupies part of the processor's resources because of the complexity of its own algorithms, so any face region calibration algorithm embedded in it must be simple, fast and effective. In addition, since the basic unit of compression processing in H.264 is the macroblock (MB), face region calibration should use the MB as the basic unit of a region.

There are currently many algorithms for calibrating face regions in an image. Knowledge-based algorithms identify faces with rules derived from prior knowledge about faces. Feature-based algorithms first find invariant features of the face and then match in the image to verify whether those features are present. There are also algorithms based on statistical and probabilistic models. Segmentation based on the skin-color features of the various parts of the face, combined with the shape features of each part and the positional relationships between them, is one kind of feature-based algorithm. Like the other types of algorithm, it has considerable complexity and is not suitable for fast calibration.

The shortcomings of the prior art lie mainly in computational complexity. Most video-related applications involve dynamic video and have substantial real-time requirements. At the same time, the devices that carry video applications, such as common mobile terminals, have limited processor capability and cannot tolerate high computational complexity.

Summary of the invention

In order to solve the problem that prior-art methods of calibrating a video region of interest are computationally complex and unsuitable for terminals or scenarios with limited processor capability, the present invention provides a method and device for fast calibration of a video region of interest.

The present invention provides a method for fast calibration of a video region of interest, comprising:

Determine the target frame image in video data；

Determine the first component image data, second component image data in target frame image；

Calculate the threshold TH of the first component image data, and calculate the difference threshold DTH between the first component image data and X times the second component image data;

Determine, according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macroblock of the target frame image, the macroblocks located in a face or skin region, and take the set of these macroblocks as the region of interest of the target frame image.

The above method for fast calibration of a video region of interest also has the following features:

Determining the macroblocks located in a face or skin region according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macroblock of the target frame image comprises:

When the value of the threshold TH is greater than or equal to the first TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than the second TH value and less than the first TH value;

When the value of the threshold TH is less than the first TH value and greater than or equal to the second TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than the second TH value and less than the first TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH;

When the value of the threshold TH is less than the second TH value and greater than or equal to the third TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than the third TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH;

When the value of the threshold TH is less than the third TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than the third TH value and less than the first TH value;

Determining that each macroblock in which the number of target pixels is greater than a first ratio of the number of pixels in the macroblock is a macroblock located in a face or skin region;

Wherein the first TH value, the second TH value and the third TH value decrease in that order.

The following operations are performed for each target frame image:

When the value of the threshold TH is less than the first TH value and greater than or equal to the second TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than a first threshold and less than the first TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH; the first threshold is less than the second TH value;

When the value of the threshold TH is less than the second TH value and greater than or equal to the third TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than the threshold TH and greater than a second threshold, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH;

When the value of the threshold TH is less than the third TH value, determining as target pixels those pixels of each macroblock whose first component value is greater than the threshold TH, greater than a third threshold, and less than the first TH value;

Wherein the first TH value, the second TH value and the third TH value decrease in that order; the first threshold, the second threshold and the third threshold decrease in that order and are all greater than the third TH value and less than the second TH value.

The first component image data is the concentration component image data in the target frame image, and the second component image data is the chrominance component image data in the target frame image; the value of X is 1, the first TH value is 160, the second TH value is 145, and the third TH value is 120.

Alternatively, the first component image data is the concentration component image data in the target frame image, and the second component image data is the chrominance component image data in the target frame image; the value of X is 1, the first TH value is 160, the second TH value is 145, the third TH value is 120, the first threshold is 140, the second threshold is 135, and the third threshold is 130.

Alternatively, the first component image data is the hue component image data in the target frame image, and the second component image data is the saturation component image data in the target frame image; the value of X is π, the first TH value is 1.2π, the second TH value is 1.1π, the third TH value is π, the first threshold is 1.05π, the second threshold is 1.02π, and the third threshold is 1.01π.

The first ratio is 1/10.

Calculating the threshold TH of the first component image data and the difference threshold DTH between the first component image data and X times the second component image data comprises: calculating both thresholds according to the maximum between-class variance method (Otsu's method).

Determining the target frame image in the video data comprises: selecting target frame images in the video data at a constant frame interval; or determining target frame images according to the motion information of macroblocks in adjacent frames of the video data.

The present invention also provides a device for fast calibration of a video region of interest, comprising:

A target frame image determining module, for determining the target frame image in video data;

A component image data determining module, for determining the first component image data and the second component image data in the target frame image;

A threshold calculation module, for calculating the threshold TH of the first component image data and the difference threshold DTH between the first component image data and X times the second component image data;

A region of interest determining module, for determining, according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macroblock of the target frame image, the macroblocks located in a face or skin region, and taking the set of these macroblocks as the region of interest of the target frame image.

The present invention provides an accurate, simple and fast method for calibrating a video region of interest. It calibrates face regions in dynamic video and is particularly suitable for terminals and usage scenarios with limited processor capability.

Detailed description of the invention

The accompanying drawings, which form a part of the present invention, are provided for further understanding of the invention. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:

Fig. 1 is a flowchart of the method for fast calibration of a video region of interest in an embodiment;

Fig. 2 is a structural diagram of the device for fast calibration of a video region of interest in an embodiment.

Specific embodiment

The technical solution of the present invention is further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.

Fig. 1 is a flowchart of the method for fast calibration of a video region of interest in an embodiment. The method includes:

Step 101, determine the target frame image in the video data;

Step 102, determine the first component image data and the second component image data in the target frame image;

Step 103, calculate the threshold TH of the first component image data and the difference threshold DTH between the first component image data and X times the second component image data;

Step 104, determine, according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macroblock of the target frame image, the macroblocks located in a face or skin region, and take the set of these macroblocks as the region of interest of the target frame image.

The target frame image in the video data can be determined in step 101 in either of two ways: target frame images are selected in the video data at a constant frame interval; or target frame images are determined according to the motion information of macroblocks in adjacent frames of the video data, with a smaller frame interval used when the motion range of the macroblocks in adjacent frames is smaller and a larger frame interval used when the motion range is larger. For devices with weak processing capability, the speed of the proposed algorithm can easily be raised by giving up a small amount of face region calibration performance. Specifically, exploiting the correlation between consecutive image frames, a frame interval greater than 1 can be used: for example, face region calibration is performed with a period of 2 or 3 frames instead of treating every frame as a target frame, and the face region is assumed unchanged for the frames within a period. When the picture content of the video stream changes slowly, this loses almost no calibration performance.
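The fixed-period variant described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `calibrate_with_period`, the `calibrate` callback and the default period of 3 are assumptions introduced here.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple, TypeVar

Frame = TypeVar("Frame")
Roi = TypeVar("Roi")


def calibrate_with_period(
    frames: Iterable[Frame],
    calibrate: Callable[[Frame], Roi],
    period: int = 3,
) -> Iterator[Tuple[int, Optional[Roi]]]:
    """Yield (frame_index, roi) for every frame.

    Only every `period`-th frame is treated as a target frame and passed to
    the (hypothetical) `calibrate` callback; the frames in between reuse the
    most recent region of interest, which is assumed to be unchanged.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    roi: Optional[Roi] = None
    for index, frame in enumerate(frames):
        if index % period == 0:  # target frame: run the calibration
            roi = calibrate(frame)
        yield index, roi
```

With `period=1` every frame is a target frame; larger periods trade calibration freshness for speed.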

In step 103, the threshold TH of the first component image data and the difference threshold DTH between the first component image data and X times the second component image data are calculated according to the maximum between-class variance method (Otsu's method).
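A minimal sketch of step 103 for 8-bit V and U planes with X = 1 is given below. The helper names `otsu_threshold` and `compute_thresholds` are assumptions, and so is the choice to compute DTH as the Otsu threshold of the absolute difference |V - U|; the text only states that both thresholds come from the maximum between-class variance method. Floating-point components such as H and S would first need to be quantized into histogram bins.

```python
from typing import Sequence, Tuple


def otsu_threshold(values: Sequence[int], levels: int = 256) -> int:
    """Return the level t that maximizes the between-class variance.

    Class 0 holds the values <= t and class 1 the values > t.
    """
    if not values:
        raise ValueError("values must not be empty")
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    weight0, sum0 = 0, 0
    for t in range(levels):
        weight0 += hist[t]
        sum0 += t * hist[t]
        if weight0 == 0:
            continue  # class 0 is still empty
        weight1 = total - weight0
        if weight1 == 0:
            break  # class 1 is empty from here on
        mean0 = sum0 / weight0
        mean1 = (sum_all - sum0) / weight1
        # Between-class variance up to the constant factor 1 / total**2.
        score = weight0 * weight1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def compute_thresholds(v_plane: Sequence[int], u_plane: Sequence[int]) -> Tuple[int, int]:
    """Return (TH, DTH) for flattened 8-bit V and U planes, assuming X = 1."""
    th = otsu_threshold(v_plane)
    # Assumption: DTH is the Otsu threshold of the per-pixel |V - U| image.
    diffs = [abs(v - u) for v, u in zip(v_plane, u_plane)]
    dth = otsu_threshold(diffs)
    return th, dth
```

Both thresholds need only one histogram pass each per target frame, which is in keeping with the low-complexity goal stated in the description.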

The implementation of the invention is illustrated below with implementation methods that use the concentration component V and the chrominance component U of the video image. When the picture format of the target frame image in step 102 is not the YUV format, the target frame image must first be converted to the YUV picture format before the following operations are performed.
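The conversion mentioned above might look like the following sketch for a single RGB pixel. The text does not specify a conversion matrix, so the full-range BT.601 coefficients used here, the offset of 128 on U and V, and the function name `rgb_to_yuv` are assumptions.

```python
from typing import Tuple


def _clamp_to_byte(value: float) -> int:
    """Round to the nearest integer and clamp into the 8-bit range."""
    return max(0, min(255, int(round(value))))


def rgb_to_yuv(r: int, g: int, b: int) -> Tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to 8-bit (Y, U, V).

    Assumed coefficients: full-range BT.601, with U and V formed as scaled
    B-Y and R-Y color differences centered on 128.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 + 0.564 * (b - y)  # scaled B-Y
    v = 128 + 0.713 * (r - y)  # scaled R-Y
    return _clamp_to_byte(y), _clamp_to_byte(u), _clamp_to_byte(v)
```

A neutral gray pixel maps to U = V = 128, and reddish pixels push V above 128, which is the range the V thresholds of the implementation methods operate in.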

Implementation method one

This implementation method, like implementation method two below, performs fast calibration of the ROI in the YUV space. The first component image data is the concentration component V image data in the target frame image, and the second component image data is the chrominance component U image data in the target frame image.

The following operations are performed for each target frame image:

When the value of TH is greater than or equal to the first TH value, the pixels of each macroblock whose first component value is greater than the second TH value and less than the first TH value are determined to be target pixels;

When the value of TH is less than the first TH value and greater than or equal to the second TH value, the pixels of each macroblock whose first component value is greater than the second TH value and less than the first TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH, are determined to be target pixels;

When the value of TH is less than the second TH value and greater than or equal to the third TH value, the pixels of each macroblock whose first component value is greater than the third TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH, are determined to be target pixels;

When the value of TH is less than the third TH value, the pixels of each macroblock whose first component value is greater than the third TH value and less than the first TH value are determined to be target pixels;

Each macroblock in which the number of target pixels is greater than the first ratio of the number of pixels in the macroblock is determined to be a macroblock located in a face or skin region.

A specific example of method one is as follows:

The threshold of the concentration component V image data in the target frame image is V_TH, and the difference threshold between the first component V image data and the second component U image data is DVU_TH. The first TH value is 160, the second TH value is 145, and the third TH value is 120.

The following operations are performed for each target frame image:

When V_TH >= 160, the pixels of each macroblock with V > 145 and V < 160 are determined to be target pixels;

When V_TH >= 145 and V_TH < 160, the pixels of each macroblock with V > 145 and V < 160 and |V - U| > DVU_TH are determined to be target pixels;

When V_TH >= 120 and V_TH < 145, the pixels of each macroblock with V > 120 and |V - U| > DVU_TH are determined to be target pixels;

When V_TH < 120, the pixels of each macroblock with V > 120 and V < 160 are determined to be target pixels.
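The four cases of this example can be written as a single per-pixel predicate. The sketch below follows the constants of the example (160, 145, 120); the function name `is_target_pixel_method_one` is an assumption, and V_TH and DVU_TH are taken as already computed for the target frame.

```python
def is_target_pixel_method_one(v: int, u: int, v_th: int, dvu_th: int) -> bool:
    """Classify one pixel as a target (face or skin) pixel under method one.

    v, u    : concentration and chrominance component values of the pixel
    v_th    : frame-level threshold V_TH of the V component
    dvu_th  : frame-level difference threshold DVU_TH
    """
    if v_th >= 160:
        return 145 < v < 160
    if v_th >= 145:  # 145 <= V_TH < 160
        return 145 < v < 160 and abs(v - u) > dvu_th
    if v_th >= 120:  # 120 <= V_TH < 145
        return v > 120 and abs(v - u) > dvu_th
    return 120 < v < 160  # V_TH < 120
```

Each pixel costs at most three comparisons and one subtraction, so the per-frame cost is dominated by the single pass over the V and U planes.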

Implementation method two

The following operations are performed for each target frame image:

When the value of the threshold TH is greater than or equal to the first TH value, the pixels of each macroblock whose first component value is greater than the second TH value and less than the first TH value are determined to be target pixels;

When the value of the threshold TH is less than the first TH value and greater than or equal to the second TH value, the pixels of each macroblock whose first component value is greater than the first threshold and less than the first TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH, are determined to be target pixels; the first threshold is less than the second TH value;

When the value of the threshold TH is less than the second TH value and greater than or equal to the third TH value, the pixels of each macroblock whose first component value is greater than the threshold TH and greater than the second threshold, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH, are determined to be target pixels;

When the value of the threshold TH is less than the third TH value, the pixels of each macroblock whose first component value is greater than the threshold TH, greater than the third threshold, and less than the first TH value are determined to be target pixels;

Wherein the first TH value, the second TH value and the third TH value decrease in that order. The first threshold, the second threshold and the third threshold decrease in that order and are all greater than the third TH value and less than the second TH value.

A specific example of method two is as follows:

The threshold of the concentration component V image data in the target frame image is V_TH, and the difference threshold between the first component V image data and the second component U image data is DVU_TH. The first TH value is 160, the second TH value is 145, the third TH value is 120, the first threshold is 140, the second threshold is 135, and the third threshold is 130.

The following operations are performed for each target frame image:

When V_TH >= 145 and V_TH < 160, the pixels of each macroblock with V > 140 and V < 160 and |V - U| > DVU_TH are determined to be target pixels;

When V_TH >= 120 and V_TH < 145, the pixels of each macroblock with V > V_TH and V > 135 and |V - U| > DVU_TH are determined to be target pixels;

When V_TH < 120, the pixels of each macroblock with V > V_TH and V > 130 and V < 160 are determined to be target pixels.
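Method two differs from method one only in its additional thresholds, which the following sketch keeps in one parameter object. The names `MethodTwoParams` and `is_target_pixel_method_two` are assumptions. The example above lists no case for V_TH >= 160, so that branch is taken from the general description of method two (first component greater than the second TH value and less than the first TH value).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodTwoParams:
    """Constants of the method-two example; the field names are hypothetical."""
    th1: int = 160  # first TH value
    th2: int = 145  # second TH value
    th3: int = 120  # third TH value
    t1: int = 140   # first threshold
    t2: int = 135   # second threshold
    t3: int = 130   # third threshold


def is_target_pixel_method_two(
    v: int, u: int, v_th: int, dvu_th: int,
    p: MethodTwoParams = MethodTwoParams(),
) -> bool:
    """Classify one pixel as a target pixel under method two."""
    if v_th >= p.th1:
        # Taken from the general description, not from the numeric example.
        return p.th2 < v < p.th1
    if v_th >= p.th2:
        return p.t1 < v < p.th1 and abs(v - u) > dvu_th
    if v_th >= p.th3:
        return v > v_th and v > p.t2 and abs(v - u) > dvu_th
    return v > v_th and v > p.t3 and v < p.th1
```

Keeping the constants in a frozen dataclass makes it straightforward to substitute other values when a different picture format or color space is used.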

Implementation method three

This implementation method three performs fast calibration of the ROI in the HSI space. The first component image data is the hue component H in the target frame image, and the second component image data is the saturation component S in the target frame image.
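The H and S components can be obtained from RGB as in the sketch below. The text does not give a conversion formula; the commonly used geometric RGB-to-HSI conversion is assumed here, with H in radians in [0, 2π) and S in [0, 1], which is consistent with thresholds expressed as multiples of π and with X = π. The function name `rgb_to_hsi` is an assumption.

```python
import math
from typing import Tuple


def rgb_to_hsi(r: int, g: int, b: int) -> Tuple[float, float, float]:
    """Convert one 8-bit RGB pixel to (H, S, I).

    H is in radians in [0, 2*pi), S and I are in [0, 1]. For gray pixels
    the hue is undefined and is returned as 0.0 by convention.
    """
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    intensity = (rn + gn + bn) / 3.0
    if intensity == 0.0:
        return 0.0, 0.0, 0.0  # black: hue and saturation undefined

    saturation = 1.0 - min(rn, gn, bn) / intensity

    numerator = 0.5 * ((rn - gn) + (rn - bn))
    denominator = math.sqrt((rn - gn) ** 2 + (rn - bn) * (gn - bn))
    if denominator == 0.0:
        hue = 0.0  # gray: hue undefined
    else:
        # Clamp against rounding error before taking the arc cosine.
        theta = math.acos(max(-1.0, min(1.0, numerator / denominator)))
        hue = theta if bn <= gn else 2.0 * math.pi - theta
    return hue, saturation, intensity
```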

The following operations are performed for each target frame image:

When the value of the threshold TH is greater than or equal to the first TH value, the pixels of each macroblock whose first component value is greater than the second TH value and less than the first TH value are determined to be target pixels;

When the value of the threshold TH is less than the first TH value and greater than or equal to the second TH value, the pixels of each macroblock whose first component value is greater than the first threshold and less than the first TH value, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH, are determined to be target pixels; the first threshold is less than the second TH value;

When the value of the threshold TH is less than the second TH value and greater than or equal to the third TH value, the pixels of each macroblock whose first component value is greater than the threshold TH and greater than the second threshold, and for which the absolute value of the difference between the first component value and the second component value is greater than the difference threshold DTH, are determined to be target pixels;

When the value of the threshold TH is less than the third TH value, the pixels of each macroblock whose first component value is greater than the threshold TH, greater than the third threshold, and less than the first TH value are determined to be target pixels;

Wherein the first TH value, the second TH value and the third TH value decrease in that order;

The first threshold, the second threshold and the third threshold decrease in that order.

A specific example of implementation method three is as follows:

The threshold of the hue component H image data in the target frame image is H_TH, and the difference threshold between the first component H image data and π times the second component S image data is DHS_TH. The first TH value is 1.2π, the second TH value is 1.1π, the third TH value is π, the first threshold is 1.05π, the second threshold is 1.02π, and the third threshold is 1.01π.

The following operations are performed for each target frame image:

When H_TH >= 1.2π, the pixels of each macroblock with H > 1.1π and H < 1.2π are determined to be target pixels;

When H_TH >= 1.1π and H_TH < 1.2π, the pixels of each macroblock with H > 1.1π and H < 1.2π and |H - S*π| > DHS_TH are determined to be target pixels;

When H_TH >= π and H_TH < 1.1π, the pixels of each macroblock with H > H_TH and H > 1.02π and |H - S*π| > DHS_TH are determined to be target pixels;

When H_TH < π, the pixels of each macroblock with H > H_TH and H > 1.01π and H < 1.2π are determined to be target pixels.
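The HSI example translates into a per-pixel predicate in the same way as the YUV examples. The sketch below uses the numeric bounds exactly as they appear in the four example cases; the function name `is_target_pixel_method_three` is an assumption, H is taken in radians and S in [0, 1].

```python
import math

PI = math.pi


def is_target_pixel_method_three(h: float, s: float, h_th: float, dhs_th: float) -> bool:
    """Classify one pixel as a target pixel under method three (HSI space).

    h, s    : hue (radians) and saturation of the pixel
    h_th    : frame-level threshold H_TH of the hue component
    dhs_th  : frame-level difference threshold DHS_TH for |H - S*pi|
    """
    diff = abs(h - s * PI)
    if h_th >= 1.2 * PI:
        return 1.1 * PI < h < 1.2 * PI
    if h_th >= 1.1 * PI:  # 1.1*pi <= H_TH < 1.2*pi
        return 1.1 * PI < h < 1.2 * PI and diff > dhs_th
    if h_th >= PI:  # pi <= H_TH < 1.1*pi
        return h > h_th and h > 1.02 * PI and diff > dhs_th
    return h > h_th and h > 1.01 * PI and h < 1.2 * PI  # H_TH < pi
```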

In this method, the first ratio is preferably, but not limited to, 1/10. When the macroblock is a 16x16 macroblock, each macroblock in which the number of target pixels is greater than 16 is determined to be a macroblock located in a face or skin region.
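The macroblock decision can be sketched as a count over a per-pixel target mask. The function name `roi_macroblocks`, the list-of-rows mask representation and the (row, column) macroblock indexing are assumptions. The text gives both a first ratio of 1/10 and a count of 16 for 16x16 macroblocks, so the sketch exposes the count as a parameter and defaults it to 16.

```python
from typing import List, Sequence, Set, Tuple


def roi_macroblocks(
    mask: Sequence[Sequence[bool]],
    mb_size: int = 16,
    min_target_pixels: int = 16,
) -> Set[Tuple[int, int]]:
    """Return the (mb_row, mb_col) indices of macroblocks in the ROI.

    mask[y][x] is True when pixel (x, y) was classified as a target pixel.
    A macroblock belongs to the ROI when it contains more than
    `min_target_pixels` target pixels. Partial macroblocks at the right and
    bottom borders are counted over the pixels that exist.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    roi: Set[Tuple[int, int]] = set()
    for top in range(0, height, mb_size):
        for left in range(0, width, mb_size):
            count = 0
            for y in range(top, min(top + mb_size, height)):
                row = mask[y]
                for x in range(left, min(left + mb_size, width)):
                    if row[x]:
                        count += 1
            if count > min_target_pixels:
                roi.add((top // mb_size, left // mb_size))
    return roi
```

The returned set of macroblock indices is the region of interest of the target frame image in the sense of step 104.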

This method is a fast method for calibrating face regions in an image based on a skin-color model and the maximum between-class variance method (Otsu's method, OTSU). Unlike methods that segment based on the skin-color features of the various parts of the face, it does not need to consider geometric shapes or the positional relationships between those shapes. It obtains only a few thresholds through simple statistical calculation and then, by checking the relationship between the UV values of the pixels in an MB and these thresholds, can determine fairly accurately whether the MB lies in a face region or in an exposed skin region.

The present invention is not limited to using the U and V components of the YUV space or the H and S components of the HSI space. Components of the RGB space, or components of other picture formats or other color spaces, may also be used; in that case the respective thresholds must be changed accordingly for the picture format or color space.

The present invention also provides a device for fast calibration of a video region of interest, corresponding to the method for fast calibration of a video region of interest of Fig. 1. Fig. 2 shows the device in an embodiment. Referring to Fig. 2, the device for fast calibration of a video region of interest includes:

A target frame image determining module 201, for determining the target frame image in the video data;

A component image data determining module 202, for determining the first component image data and the second component image data in the target frame image;

A threshold calculation module 203, for calculating the threshold TH of the first component image data and the difference threshold DTH between the first component image data and X times the second component image data;

A region of interest determining module 204, for determining, according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macroblock of the target frame image, the macroblocks located in a face or skin region, and taking the set of these macroblocks as the region of interest of the target frame image.

The functions of the above modules are implemented in the same way as described for the above method, and the description is not repeated here.
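For illustration, the per-pixel decision performed by the area-of-interest determining module 204 might be sketched as below, following the first alternative recited in claim 1 with the YUV values of claim 2 (first, second, and third TH values of 160, 145, and 120, and X equal to 1). The function name `is_target_pixel` and its parameter names are hypothetical. Scaling the second component by X inside the difference is an assumption of this sketch; with X equal to 1 it coincides with the plain difference recited in the claim.

```python
def is_target_pixel(v, u, th, dth, x=1, th1=160, th2=145, th3=120):
    """Decide whether one pixel is a target (face/skin) pixel.

    v, u : values of the first and second components of the pixel
    th   : frame-level threshold TH of the first component
    dth  : frame-level difference threshold DTH
    th1 > th2 > th3 are the first, second, and third TH values.
    """
    diff = abs(v - x * u)  # x scaling is an assumption; see the lead-in
    if th >= th1:
        return th2 < v < th1
    if th >= th2:
        return th2 < v < th1 and diff > dth
    if th >= th3:
        return v > th3 and diff > dth
    return th3 < v < th1
```

Because TH and DTH are computed once per target frame, only a handful of comparisons remain per pixel, which is consistent with the stated suitability for terminals with limited processing capability.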

The method of the present invention is simple, fast, accurate, and effective. It can be implemented entirely in software, is very flexible to deploy, and has almost no cost, which makes it well suited to video image processing, especially Internet video applications: once the face region has been calibrated, subsequent special processing of that region becomes possible. For example, when H.264 encoding is used, the quality of the face region can be raised while the quality of the other parts of the image is appropriately lowered, keeping the overall bit rate constant. In this way, even when network bandwidth is limited, an image with a clear face, rather than an image that is blurred throughout, can be obtained after the compressed bitstream is decoded.
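One possible way to express such region-dependent quality is a per-macro-block quantization parameter (QP) map, sketched below. The function name `qp_map` and the values of `base_qp`, `roi_delta`, and `bg_delta` are purely illustrative assumptions and are not taken from the invention; the only fixed fact used is that H.264 QP values for 8-bit video lie in the range 0 to 51, with lower values giving higher quality. How such a map is passed to an encoder is encoder-specific and is not shown, and fixed offsets alone do not guarantee a constant bit rate.

```python
def qp_map(mb_rows, mb_cols, roi, base_qp=30, roi_delta=-4, bg_delta=2):
    """Build a per-macro-block QP map from a set of ROI macro blocks.

    roi is a set of (mb_row, mb_col) pairs marking face/skin macro blocks.
    ROI macro blocks get a lower QP (higher quality); all others get a
    higher QP. The offsets are illustrative assumptions.
    """
    def clamp(qp):
        return max(0, min(51, qp))  # valid H.264 QP range for 8-bit video

    return [
        [clamp(base_qp + (roi_delta if (r, c) in roi else bg_delta))
         for c in range(mb_cols)]
        for r in range(mb_rows)
    ]
```

In practice a rate controller would still adjust the base QP frame by frame so that the overall bit rate stays at its target.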

Those of ordinary skill in the art will appreciate that all or part of the steps of the above method may be completed by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or part of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, each module or unit in the above embodiments may be implemented in the form of hardware or in the form of a software functional module. The present invention is not limited to any particular combination of hardware and software.

The features described above may be implemented individually or combined in various ways, and all such variants fall within the protection scope of the present invention.

It should be noted that, in this document, the terms "include", "comprise", and any other variants thereof are intended to cover a non-exclusive inclusion, so that an article or device that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or elements inherent to the article or device. Without further limitation, an element defined by the phrase "including ..." does not exclude the presence of other identical elements in the article or device that includes that element.

The above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it; the present invention has been described in detail only with reference to preferred embodiments. Those skilled in the art should understand that modifications or equivalent replacements made to the technical solution of the present invention without departing from its spirit and scope shall all be covered by the claims of the present invention.

Claims

1. A video region-of-interest fast calibration method, characterized by comprising:

Determine the target frame image in video data;

Calculate the threshold TH of the first component image data and calculate the difference threshold DTH between the first component image data and X times the second component image data;

Determine the macro blocks in the face or skin area according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macro block of the target frame image, and take the set of those macro blocks as the area-of-interest of the target frame image;

Said determining of the macro blocks in the face or skin area according to the threshold TH, the difference threshold DTH, and the values of the first component and the second component of the pixels in each macro block of the target frame image includes:

When the value of the threshold TH is greater than or equal to the first TH value, determine that a pixel of a macro block whose first-component value is greater than the second TH value and less than the first TH value is a target pixel;

When the value of the threshold TH is less than the first TH value and greater than or equal to the second TH value, determine that a pixel of a macro block whose first-component value is greater than the second TH value and less than the first TH value, and for which the absolute value of the difference between the value of the first component and the value of the second component is greater than the difference threshold DTH, is a target pixel;

When the value of the threshold TH is less than the second TH value and greater than or equal to the third TH value, determine that a pixel of a macro block whose first-component value is greater than the third TH value, and for which the absolute value of the difference between the value of the first component and the value of the second component is greater than the difference threshold DTH, is a target pixel;

When the value of the threshold TH is less than the third TH value, determine that a pixel of a macro block whose first-component value is greater than the third TH value and less than the first TH value is a target pixel;

Determine that a macro block in which the number of target pixels is greater than the first ratio of the number of pixels in the macro block is a macro block in the face or skin area;

Alternatively,

The following operations are executed for each target frame image:

When the value of the threshold TH is less than the first TH value and greater than or equal to the second TH value, determine that a pixel of a macro block whose first-component value is greater than the first threshold and less than the first TH value, and for which the absolute value of the difference between the value of the first component and the value of the second component is greater than the difference threshold DTH, is a target pixel; the first threshold is less than the second TH value;

When the value of the threshold TH is less than the second TH value and greater than or equal to the third TH value, determine that a pixel of a macro block whose first-component value is greater than the threshold TH and greater than the second threshold, and for which the absolute value of the difference between the value of the first component and the value of the second component is greater than the difference threshold DTH, is a target pixel;

When the value of the threshold TH is less than the third TH value, determine that a pixel of a macro block whose first-component value is greater than the threshold TH, greater than the third threshold, and less than the first TH value is a target pixel;

Wherein the first TH value, the second TH value, and the third TH value decrease successively; the first threshold, the second threshold, and the third threshold decrease successively, and are all greater than the third TH value and less than the second TH value.

2. The video region-of-interest fast calibration method according to claim 1, characterized in that

The first component image data is the concentration component image data in the target frame image, and the second component image data is the chrominance component image data in the target frame image; the value of X is 1, the first TH value is 160, the second TH value is 145, and the third TH value is 120.

3. The video region-of-interest fast calibration method according to claim 1, characterized in that

The first component image data is the concentration component image data in the target frame image, and the second component image data is the chrominance component image data in the target frame image; the value of X is 1, the first TH value is 160, the second TH value is 145, the third TH value is 120, the first threshold is 140, the second threshold is 135, and the third threshold is 130.

4. The video region-of-interest fast calibration method according to claim 1, characterized in that

The first component image data is the hue (H) component image data in the target frame image, and the second component image data is the saturation (S) component image data in the target frame image; the value of X is π, the first TH value is 1.2π, the second TH value is 1.1π, the third TH value is π, the first threshold is 1.05π, the second threshold is 1.02π, and the third threshold is 1.01π.

5. The video region-of-interest fast calibration method according to claim 1, characterized in that

The first ratio is 1/10.

6. The video region-of-interest fast calibration method according to claim 1, characterized in that

Said calculating of the threshold TH of the first component image data and of the difference threshold DTH between the first component image data and X times the second component image data includes: calculating, according to the maximum between-class variance method, the threshold TH of the first component image data and the difference threshold DTH between the first component image data and X times the second component image data.

7. The video region-of-interest fast calibration method according to claim 1, characterized in that

Said determining of the target frame image in the video data includes: selecting target frame images in the video data at a fixed frame interval; or determining the target frame image according to the motion information of the macro blocks in adjacent frames of the video data.