CN103208110B - Method and device for converting video images - Google Patents

Method and device for converting video images

Info

Publication number
CN103208110B
CN103208110B (application CN201210013123.XA; published as CN103208110A)
Authority
CN
China
Prior art keywords
image
preprocessing
current frame
video
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210013123.XA
Other languages
Chinese (zh)
Other versions
CN103208110A (en)
Inventor
刘立峰
林福辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Spreadtrum Communications Shanghai Co Ltd filed Critical Spreadtrum Communications Shanghai Co Ltd
Priority to CN201210013123.XA priority Critical patent/CN103208110B/en
Publication of CN103208110A publication Critical patent/CN103208110A/en
Application granted granted Critical
Publication of CN103208110B publication Critical patent/CN103208110B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Processing (AREA)

Abstract

A method and device for converting video images. The method includes: preprocessing the current frame of a two-dimensional video to obtain a preprocessed current frame, the preprocessing including removing the global motion of the current frame; extracting a depth image from the preprocessed current frame; and obtaining the left-eye and right-eye images of the current frame from the preprocessed current frame and its depth image. With this technical solution, two-dimensional video can be converted to three-dimensional video on a mobile phone, and a user making a 3D video with a single-camera phone can move the phone in any direction and obtain a good-quality 3D video without repeated operations, which is a great convenience.

Description

Method and device for converting video images
Technical field
The present invention relates to the field of image technology, and in particular to a method and device for converting video images.
Background art
With the rapid development of computer and communication technology, multimedia video applications such as video playback, digital television, video calls, and mobile-phone photography have become increasingly widespread. However, traditional two-dimensional (2D) video lacks depth and stereoscopic perception and cannot satisfy the demand for realistic, immersive viewing; three-dimensional (3D) video technology therefore came into being.
3D video is generated in two main ways. One is to shoot 3D video directly with a stereoscopic camera; the stereoscopic effect is strong and the visual result is realistic, but the production cost is very high. The other is to convert existing 2D video into 3D video algorithmically: the depth information present in the 2D video is extracted, the 2D video is converted into 3D video using that depth information, and the result is shown on a 3D display terminal. 3D video generated this way is inexpensive to produce and has become a main direction of development.
Currently, the digital-television field has realized conversion of ordinary single-channel 2D video into 3D video, as shown in Fig. 1. Fig. 1 shows a 2D-to-3D video conversion system for digital television, comprising two basic modules: a depth-image estimation module 1 and a 3D-image rendering module 2. The depth-image estimation module 1 includes a motion estimation unit 10, a color segmentation unit 11, and a fusion unit 12; the 3D-image rendering module 2 includes a 3D-image mapping unit 20 and a hole-filling unit 21.
The motion estimation unit 10 obtains the motion vector field (motion information) of the single-channel 2D video; the color segmentation unit 11 partitions the video into regions by color; and the fusion unit 12 merges the motion vector field from the motion estimation unit 10 with the color information of the regions from the color segmentation unit 11 to obtain the final depth image of the video. The 3D-image mapping unit 20 maps the obtained depth image, and the hole-filling unit 21 fills the holes in the mapped image to produce the left-eye image. Taking the 2D video itself as the right-eye image, the resulting left-eye and right-eye images, which have a certain parallax, can then be displayed as 3D video by a 3D display system.
The depth image of the 2D video can also be acquired in other ways, for example with a camera that has depth-extraction capability (specifically, a camera that obtains depth images using infrared light or structured light), or from two-channel (stereo) video.
Fig. 2 shows an existing way of obtaining a depth image from two-channel video, using the Hybrid Recursive Matching (HRM) algorithm. Taking the left-eye image as an example, the image is first rectified, for instance with an anti-distortion model; hybrid recursive matching is then applied to the rectified image, and a consistency check removes unreliable motion vectors. The rectified left-eye image is also segmented; the segmented image, the rectified image, and the consistency-checked image undergo region-based post-processing to further improve the precision of the generated depth image. Finally, region-based interpolation over the segmented image, the rectified image, and the post-processed image yields the left-eye depth image. The right-eye depth image is obtained analogously, so the details are not repeated.
With the development of 3D video and mobile-phone technology, and of 3D display technology in particular, 3D display has progressed from the traditional red-blue anaglyph (which requires glasses), through shutter-based stereoscopic display, to today's naked-eye 3D display (no glasses needed), providing the technical basis for playing and displaying 3D video on mobile phones.
Most 3D phones, however, can only play back ready-made 3D video, while the video files users obtain through the network or other channels, such as films and television signals, are still 2D. The range of 3D video people can watch is therefore very limited, which in turn limits the adoption of 3D technology on mobile phones.
In addition, mobile-phone users also want to make their own 3D videos. At present, a user can do so with a phone fitted with two cameras: multiple cameras (usually two) mounted on one side of the phone, mimicking human binocular vision, shoot multi-channel video with parallax to obtain stereoscopic images with binocular disparity. But this approach raises the hardware cost of the phone on one hand, and its size and power consumption on the other, making it hard to popularize on the market. Alternatively, a user can make 3D video with a single-camera phone: several images with parallax are captured as the phone moves, the phone's rotation is measured with its built-in attitude sensor or by image processing, the effect of the rotation on the images is removed using that information, and two suitable images are selected as the left-eye and right-eye images and fed to the 3D display system, completing the 3D video. Making 3D video with a single camera has clear limitations, however: only still images can be obtained, the user must hold the phone and move it in a specific direction over a specific range during capture, and the operation must be repeated many times, which is very inconvenient, and the resulting 3D video is of poor quality.
The digital-television 2D-to-3D conversion system described above cannot currently be applied on a mobile-phone platform. How to convert 2D video into 3D video on a phone, or to let users conveniently produce good-quality 3D video with a phone, is therefore one of the problems urgently awaiting solution.
For other techniques for converting two-dimensional video into three-dimensional video, see also U.S. patent application publication No. US2011018873A1, entitled "Two-dimensional to three-dimensional image conversion system and method."
Summary of the invention
The problem solved by the present invention is that prior-art methods for converting two-dimensional video into three-dimensional video give poor results when applied to three-dimensional display on a mobile phone.
To solve the above problem, the present invention provides a method for converting video images, comprising:
preprocessing the current frame of a two-dimensional video to obtain a preprocessed current frame, the preprocessing including removing the global motion of the current frame;
extracting a depth image from the preprocessed current frame;
obtaining the left-eye image and right-eye image of the current frame from the preprocessed current frame and its depth image.
Optionally, removing the global motion of the current frame includes:
obtaining the global translational motion vector field and the global rotational motion vector field of the current frame;
applying a global translation and a global rotation to the current frame based on its global translational and rotational motion vector fields.
Optionally, the preprocessing further includes removing the distortion of the current frame before removing its global motion.
Optionally, the distortion of the current frame is removed using a barrel distortion model.
Optionally, extracting the depth image of the preprocessed current frame includes:
removing pixels with unreliable motion vectors from the preprocessed current frame;
obtaining the color-region segmentation information of the preprocessed current frame;
filling the holes in the preprocessed current frame from which the unreliable pixels were removed, based at least on the color-region segmentation information, to obtain the depth image of the preprocessed current frame.
Optionally, removing pixels with unreliable motion vectors from the preprocessed current frame includes:
matching the preprocessed current frame against the previous frame to obtain, for each pixel of the preprocessed current frame, the forward matching value and backward matching value corresponding to its match point;
dividing the preprocessed current frame into image blocks of a predetermined size;
removing the pixels of any image block in which the error between the forward and backward matching values exceeds a first threshold and the smoothness of the block exceeds a second threshold.
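The two-threshold test above can be sketched as follows. This is a minimal illustration, assuming dense forward and backward motion fields are already available; the threshold values, the variance-based smoothness measure, and the function name are illustrative choices, not taken from the patent.

```python
import numpy as np

def unreliable_mask(fwd, bwd, frame, block=8, t1=1.0, t2=4.0):
    """Flag pixels whose forward/backward motion vectors disagree
    (|v_fwd + v_bwd| > t1) inside blocks that are smooth (low variance),
    following the two-threshold idea in the text. t1, t2 and block are
    illustrative values."""
    err = np.linalg.norm(fwd + bwd, axis=-1)      # forward/backward error
    h, w = frame.shape[:2]
    smooth = np.zeros((h, w), dtype=bool)
    for y in range(0, h, block):
        for x in range(0, w, block):
            blk = frame[y:y+block, x:x+block]
            # low variance -> smooth, featureless block: motion there is
            # hard to trust
            smooth[y:y+block, x:x+block] = blk.var() < t2
    return (err > t1) & smooth
```

Pixels flagged by the mask would then be removed from the depth estimate and treated as holes.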
Optionally, the matching of the preprocessed current frame with the previous frame uses one of block matching, optical flow, and hybrid recursive matching.
Optionally, filling the holes in the preprocessed current frame from which the unreliable pixels were removed, based at least on the color-region segmentation information, includes: determining the depth of the pixels in a hole by combining the color-region segmentation information with the depth information of the cached depth image of at least the previous frame, and filling the hole accordingly.
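The idea of combining cached depth with color segmentation can be sketched as below. Averaging the previous frame's depth over a pixel's color segment is one simple way to realize it; the function name and the per-segment mean are assumptions for illustration, and the patent's actual fusion logic may differ.

```python
import numpy as np

def fill_depth_holes(depth, hole_mask, prev_depth, segments):
    """Fill unknown depth pixels (hole_mask) from the previous frame's
    depth, averaged over each pixel's color segment, per the claim's idea
    of combining segmentation with cached depth. A simplified sketch."""
    out = depth.copy()
    for seg in np.unique(segments):
        m = (segments == seg) & hole_mask
        if m.any():
            # assign the segment's average cached depth to its hole pixels
            out[m] = prev_depth[segments == seg].mean()
    return out
```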
Optionally, the depth image of the preprocessed current frame is the normalized and filtered depth image of the preprocessed current frame.
Optionally, the filtering is one of smoothing filtering, median filtering, and bilateral filtering.
Optionally, obtaining the left-eye and right-eye images of the current frame from the preprocessed current frame and its depth image includes:
taking the preprocessed current frame as the right-eye image or the left-eye image;
mapping the depth image of the preprocessed current frame to obtain a mapped image;
filling the holes in the mapped image to obtain the corresponding image for the other eye.
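The mapping-plus-hole-filling step can be sketched as a basic depth-image-based rendering routine. The disparity proportional to normalized depth, the left-to-right fill, and the `max_disp` parameter are illustrative simplifications; the patent's linear hole filtering may differ.

```python
import numpy as np

def render_other_view(image, depth, max_disp=8):
    """Depth-image-based rendering sketch: shift each pixel horizontally
    by a disparity proportional to its (normalized) depth, then fill the
    exposed holes from the nearest filled pixel on the left, a crude
    stand-in for the linear hole filtering mentioned in the text."""
    h, w = depth.shape
    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    disp = (depth.astype(np.float64) / max(depth.max(), 1) * max_disp).astype(int)
    for y in range(h):
        for x in range(w):
            nx = x + disp[y, x]
            if 0 <= nx < w:
                out[y, nx] = image[y, x]
                filled[y, nx] = True
    for y in range(h):            # simple left-to-right hole filling
        for x in range(1, w):
            if not filled[y, x]:
                out[y, x] = out[y, x - 1]
    return out
```

The original preprocessed frame would serve as one eye's image and the rendered view as the other's.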
Optionally, mapping the depth image of the preprocessed current frame includes low-pass filtering the depth image before mapping it.
Optionally, filling the holes in the mapped image includes applying linear filtering to the mapped image.
Optionally, the method further includes video-encoding the left-eye and right-eye images of each frame of the two-dimensional video to obtain a three-dimensional video stream.
Optionally, the method further includes compressing the left-eye and right-eye images of the current frame to obtain a three-dimensional image.
To solve the above problem, the present invention also provides a device for converting video images, comprising:
a preprocessing unit adapted to preprocess the current frame of a two-dimensional video to obtain a preprocessed current frame, the preprocessing including removing the global motion of the current frame;
a depth-image extraction unit adapted to extract the depth image of the preprocessed current frame;
a depth-image rendering unit adapted to obtain the left-eye and right-eye images of the current frame from the preprocessed current frame and its depth image.
Compared with the prior art, the technical solution of the present invention has the following advantages:
The current frame of the two-dimensional video is first preprocessed to obtain a preprocessed current frame; the depth image of the preprocessed frame is then obtained; finally, the left-eye or right-eye image is mapped from that depth image while the preprocessed frame itself serves as the image for the other eye. This converts two-dimensional video to three-dimensional video on a mobile phone, and also lets a user making a 3D video with a single-camera phone move the phone in any direction and obtain a good-quality 3D video without repeated operations, which is a great convenience.
Because the distortion and the global motion of the current frame are removed by preprocessing, pixels with unreliable motion vectors are removed from the preprocessed frame, and the resulting holes are filled based on the color-region segmentation information of the preprocessed frame, the obtained depth image is more accurate than one derived directly from the raw current frame, which also improves the quality of the final 3D video.
Further, after the pixels with unreliable motion vectors have been removed, the large holes in the preprocessed current frame are filled using the cached depth image of at least the previous frame together with the color-region segmentation information of the current frame, further improving the accuracy of the obtained depth image.
Normalizing and filtering the depth image of the preprocessed current frame further improves its accuracy and hence the quality of the 3D video.
Low-pass filtering the depth image of the preprocessed current frame before mapping it smooths the edges of the depth image, reduces the holes produced when the depth image is mapped, and improves the quality of the final left-eye or right-eye image and hence of the 3D video.
Description of the drawings
Fig. 1 shows a 2D-to-3D video conversion system for digital television;
Fig. 2 is a flow diagram of depth-image extraction using the hybrid recursive matching algorithm;
Fig. 3 is a flow diagram of the video-image conversion method of an embodiment of the present invention;
Fig. 4 illustrates the barrel lens distortion model;
Fig. 5 is a schematic diagram of depth-image extraction;
Fig. 6 is a flow diagram of extracting the depth image of the preprocessed current frame in an embodiment of the present invention;
Fig. 7 is a structural diagram of the video-image conversion device of an embodiment of the present invention;
Fig. 8 is a structural diagram of the video-image conversion device of another embodiment of the present invention.
Detailed description of embodiments
To make the above objects, features, and advantages of the invention clearer and easier to understand, specific embodiments of the invention are described in detail below with reference to the accompanying drawings.
Many details are set forth in the following description to give a full understanding of the invention. The invention can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from its spirit; the invention is therefore not limited to the specific embodiments disclosed below.
As explained in the background section, in the prior art mobile phones can only play back 3D video, most currently available video and image sources are 2D, and making 3D videos with an existing single-camera phone gives poor results and has clear limitations.
The inventors found that extracting image depth information relies on the parallax produced by the translational motion of the camera itself; if the camera also rotates, a global translational or rotational motion vector field appears in the video scene. Because the current digital-television 2D-to-3D conversion system cannot remove the global motion in 2D video images, applying it on a mobile terminal (such as a phone) easily yields poor 3D display quality; moreover, when the video contains occluded regions or smooth, featureless regions, the depth image obtained with that system is also poor.
In addition, because phone camera apertures are generally small, images captured with a phone camera are quite distorted, and the above system does not correct distorted images.
The digital-television 2D-to-3D conversion system therefore cannot be applied directly on a mobile-phone platform. The inventors accordingly propose to first preprocess the 2D video, removing the image distortion caused by the camera and the global motion caused by camera rotation; then to check the motion vectors of pixels in occluded regions and smooth regions (regions where features are hard to detect) of the preprocessed images and remove the pixels with unreliable motion vectors; and then to fill the resulting holes in the images from which those pixels were removed, obtaining a depth image of relatively good quality.
For a better understanding of the technical solution of the present invention, the terms used in the present invention are first explained:
Match point: the pixel in frame i-1 that corresponds to a given pixel in frame i.
Match block: the image block in frame i-1 that corresponds to a given image block in frame i.
Motion vector: the relative displacement between a pixel in frame i and its match point.
Forward matching value: the forward motion vector corresponding to the match point of a pixel in frame i.
Backward matching value: the backward motion vector corresponding to the match point of a pixel in frame i.
Referring to Fig. 3, which is a flow diagram of the video-image conversion method of an embodiment of the present invention, the method includes:
Step S11: preprocess the current frame of the two-dimensional video to obtain a preprocessed current frame, the preprocessing including removing the global motion of the current frame.
Step S12: extract the depth image of the preprocessed current frame.
Step S13: obtain the left-eye and right-eye images of the current frame from the preprocessed current frame and its depth image.
In step S11, in this embodiment, the 2D video can be obtained by decoding a 2D video stream that the user acquired over the network or by other means, or it can be shot by the user with the phone's camera. Preprocessing the 2D video specifically means preprocessing each of its frames. The preprocessing includes removing the global motion of the current frame, where global motion refers to video motion caused by rotation of the camera, i.e. both the background and the targets of the whole image are moving.
Note that if the 2D video obtained as above has already had its distortion removed, only the global motion of the 2D video images needs to be removed; if not, both the distortion and the global motion of the 2D video images must be removed. Specifically, the distortion of the current frame is removed before its global motion is removed.
This embodiment is described for the case where the distortion of the current frame has not been removed. The distortion of the current frame can be removed by lens calibration, an anti-distortion model, and so on; removing the global motion of the current frame mainly means removing the global motion caused by camera rotation, for which optical flow or other methods can be used, the choice of method depending on actual requirements.
Specifically, in this embodiment the distorted 2D video images are corrected with a barrel lens distortion model. Referring to Fig. 4, which illustrates the barrel lens distortion model, panel (a) shows the undistorted image and panel (b) the distorted image. For a pixel in the image, the distortion is described by a radial model
r_u = G(k, r_d)  (1)
where r_u is the distance of the pixel from the distortion center in the undistorted image, r_d is its distance from the distortion center in the distorted image, and k is the barrel distortion coefficient, determined by the optical characteristics of the lens (a common choice is the cubic model r_u = r_d(1 + k·r_d²)). The purpose of distortion correction is then to determine, for each pixel in panel (a), the corresponding position in panel (b), i.e.:
r_d = F(k, r_u)  (2)
where F is the inverse of (1). For convenience of computation, formula (2) is usually transformed; with the cubic model above, for example, r_d ≈ r_u / (1 + k·r_u²).
Once the position in panel (b) corresponding to each pixel position in panel (a) has been determined by the above formula, the pixel value at that position in (b) is taken as the value of the corresponding pixel in (a), reconstructing the undistorted image.
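The reconstruction procedure just described can be sketched as below. The cubic radial model and the coefficient value are assumptions; the patent's exact formula (1) depends on the lens.

```python
import numpy as np

def undistort_barrel(img, k=1e-6):
    """Reconstruct an undistorted image by mapping each target pixel back
    into the distorted frame and copying the nearest source pixel, as
    described in the text. The radial model r_d = r_u / (1 + k*r_u**2) is
    a common approximation, assumed here; k is illustrative."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0      # distortion center
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    dy, dx = ys - cy, xs - cx
    ru = np.hypot(dx, dy)                      # radius in undistorted image
    scale = 1.0 / (1.0 + k * ru**2)            # r_d / r_u
    sy = np.clip(np.rint(cy + dy * scale), 0, h - 1).astype(int)
    sx = np.clip(np.rint(cx + dx * scale), 0, w - 1).astype(int)
    return img[sy, sx]
```

With k = 0 the mapping is the identity, which gives a quick sanity check.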
After the distortion of the current frame has been removed by the above method, its global motion is removed, which can be done as follows:
first obtain the global translational motion vector field and the global rotational motion vector field of the current frame, then apply a global translation and a global rotation to the current frame based on these fields.
Those skilled in the art will understand that obtaining these fields requires first obtaining, for each pixel of the current frame, its global translational motion vector and global rotational motion vector; the global translational motion vectors of all pixels constitute the global translational motion vector field, and the global rotational motion vectors of all pixels constitute the global rotational motion vector field. Applying the global translation and rotation to the current frame based on these fields means first translating and rotating each pixel according to its global translational and rotational motion vectors; doing so for all pixels realizes the global translation and rotation of the current frame. Obtaining the global translational and rotational motion vectors of a pixel is described in detail below.
In this embodiment the global translational motion vector is split into a horizontal component and a vertical component. For a phone with an attitude sensor (gyroscope), the horizontal global motion of a pixel in the frame captured by the camera can be measured by the angle θ_h through which the phone camera has rotated about the horizontal direction, and the vertical global motion by the angle θ_v about the vertical direction; θ_h and θ_v are measured by the gyroscope. The global rotational motion vector of a pixel can be measured by the angle θ_o through which the phone camera has rotated about its own optical axis.
If the phone has no attitude sensor, the global translational and rotational motion vectors can be obtained by image processing; in this embodiment they are obtained by optical flow. Optical flow is first based on the brightness constancy assumption, i.e.:
I_x·u(x, y) + I_y·v(x, y) + I_t = 0
where I_x is the derivative of pixel brightness with respect to x, I_y the derivative with respect to y, I_t the derivative with respect to t, u(x, y) the x component of the motion vector of pixel (x, y), and v(x, y) its y component.
Based further on the smoothness assumption of optical flow, i.e. that neighboring pixels in the image move in the same way, the x component u(x, y) and y component v(x, y) of the motion vector of pixel (x, y) are obtained by least squares over the n pixels of a neighborhood:
[u(x, y), v(x, y)]ᵀ = (AᵀA)⁻¹ Aᵀ b  (3)
with A = [[I_x1, I_y1]; …; [I_xn, I_yn]] and b = [-I_t1, …, -I_tn]ᵀ, where I_x1 is the derivative of the brightness of the first pixel with respect to x and I_xn that of the n-th pixel; I_y1 and I_yn are the derivatives with respect to y; I_t1 and I_tn are the derivatives with respect to t; and 1 ≤ n ≤ the total number of pixels of the current frame.
To speed up computation, in this embodiment not every pixel of the current frame is computed; instead, samples are taken at a fixed spacing, for example one pixel every 10 pixels horizontally and vertically, and substituted into formula (3) to obtain the x component u(x, y) and y component v(x, y) of the motion vector of pixel (x, y).
After the components u(x, y) and v(x, y) of the sampled pixels' motion vectors have been obtained in this way, all the u(x, y) are averaged and the average is taken as the horizontal global motion vector θ_h; all the v(x, y) are averaged and the average is taken as the vertical global motion vector θ_v.
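The sampled least-squares flow estimate and the averaging into the horizontal and vertical global motion components can be sketched as follows. The window size, sampling step, and small ridge term are illustrative choices, and the averages are returned in pixels rather than converted to angles.

```python
import numpy as np

def global_translation(prev, curr, step=10, win=5):
    """Estimate horizontal/vertical global motion by solving a small
    least-squares optical-flow system (formula (3)) at samples taken every
    `step` pixels, then averaging the per-sample results, as the text
    describes. Window size and the ridge term are illustrative."""
    prev = prev.astype(np.float64)
    curr = curr.astype(np.float64)
    Iy, Ix = np.gradient(prev)        # np.gradient: axis 0 (y) first
    It = curr - prev
    us, vs = [], []
    h, w = prev.shape
    r = win // 2
    for y in range(r, h - r, step):
        for x in range(r, w - r, step):
            ix = Ix[y-r:y+r+1, x-r:x+r+1].ravel()
            iy = Iy[y-r:y+r+1, x-r:x+r+1].ravel()
            it = It[y-r:y+r+1, x-r:x+r+1].ravel()
            A = np.stack([ix, iy], axis=1)
            ATA = A.T @ A + 1e-6 * np.eye(2)   # small ridge for stability
            u, v = np.linalg.solve(ATA, -A.T @ it)
            us.append(u)
            vs.append(v)
    return float(np.mean(us)), float(np.mean(vs))
```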
Optical flow is again used to obtain the motion vector of each pixel, i.e. formula (3) is still used, now with n = 1. Subtracting the horizontal global motion vector from the obtained motion vector of a pixel gives u_o (one u_o per pixel), and subtracting the vertical global motion vector gives v_o (one v_o per pixel); u_o and v_o are then used to estimate the global rotational motion vector θ_o of the pixel.
Further, to obtain the global rotational motion vector θ_o of a pixel quickly, a simple search is used. Specifically, over the range from 0° to α degrees, the x and y components of the rotational motion vector are computed at intervals of β degrees, i.e.:
u_i = r_u(1 - cos θ_i)
v_i = r_u·sin θ_i  (4)
where u_i is the x component and v_i the y component of the rotational motion vector of pixel (x, y), r_u is the distance of the pixel from the rotation center, and θ_i = β, 2β, 3β, …, α degrees. The mean square errors of u_1, u_2, u_3, … against u_o are computed separately, and the u_min with the smallest mean square error against u_o is taken out; likewise, the mean square errors of v_1, v_2, v_3, … against v_o are computed, and the v_min with the smallest error is taken out. The angles corresponding to u_min and v_min define the new range of θ_i; within that range, formula (4) is again used to compute the x and y components of the rotational motion vector every γ degrees, and new values of u_min and v_min are obtained. When the angles corresponding to the final u_min and v_min differ by less than 0.1°, the angle of u_min, or of v_min, or the average of the two angles, is taken as the global rotational motion vector θ_o. Otherwise the search continues at intervals of σ degrees until the angles corresponding to the obtained u_min and v_min differ by less than 0.1°.
The above method is described in detail with α = 90°, β = 10° and γ = 1°. The x-direction component and the y-direction component of the rotational motion vector are first calculated once every 10°, i.e. the mean square errors of u_1 with u_o, of u_2 with u_o, ..., of u_9 with u_o are calculated respectively and the u_min with the minimum mean square error with respect to u_o is taken out; the mean square errors of v_1 with v_o, of v_2 with v_o, ..., of v_9 with v_o are calculated respectively and the v_min with the minimum mean square error with respect to v_o is taken out. The angle θ_umin corresponding to u_min and the angle θ_vmin corresponding to v_min are taken as the range of θ_i in formula (4); here it is assumed that θ_vmin > θ_umin, with θ_umin = 1° and θ_vmin = 20°. Since γ = 1°, the components are calculated once every 1 degree, so that θ_i = 1°, 2°, 3°, ..., 19°, 20° at this time. The mean square errors of u_1 with u_o, ..., of u_20 with u_o are calculated respectively and the new u_min is taken out; the mean square errors of v_1 with v_o, ..., of v_20 with v_o are calculated respectively and the new v_min is taken out. The angles θ_umin and θ_vmin now obtained serve as the next range of θ_i; if the difference between them is less than 0.1°, then θ_umin, or θ_vmin, or their average is taken as the global rotational motion vector θo. Otherwise the above process is repeated at ever finer step sizes until the difference between the finally obtained θ_umin and θ_vmin is less than 0.1°.
It should be noted that in the above process the values of α, β, γ, σ, ... depend on the specific situation; generally, so that the global rotational motion vector θo of the pixels can be obtained quickly, α is taken as 90° and the step sizes β, γ, σ, ... lie between 1° and 10°.
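Under stated assumptions (synthetic residual components per sampled pixel, an illustrative function name, and a fixed sequence of step sizes standing in for β, γ, σ, ...), the coarse-to-fine angle search can be sketched as:

```python
import math

def search_rotation_angle(u_o, v_o, r_u, alpha=90.0, steps=(10.0, 1.0, 0.1)):
    """Coarse-to-fine search for the global rotation angle (degrees).

    u_o, v_o : lists of residual motion components (per sampled pixel),
               after the global translation has been subtracted.
    r_u      : list of per-pixel rotational motion magnitudes (same length).
    At each level, formula (4) is evaluated over the candidate angles,
    keeping the angles whose predicted u / v components have minimum
    mean square error against u_o / v_o; the search stops when the two
    angles agree to within 0.1 degrees.
    """
    def mse_u(theta):
        return sum((r * (1 - math.cos(math.radians(theta))) - uo) ** 2
                   for r, uo in zip(r_u, u_o)) / len(u_o)

    def mse_v(theta):
        return sum((r * math.sin(math.radians(theta)) - vo) ** 2
                   for r, vo in zip(r_u, v_o)) / len(v_o)

    lo, hi = 0.0, alpha
    theta_u = theta_v = 0.0
    for step in steps:
        cands = [lo + step * i for i in range(1, int((hi - lo) / step) + 1)]
        theta_u = min(cands, key=mse_u)      # angle of u_min
        theta_v = min(cands, key=mse_v)      # angle of v_min
        if abs(theta_u - theta_v) < 0.1:
            return (theta_u + theta_v) / 2.0
        lo, hi = min(theta_u, theta_v), max(theta_u, theta_v)
    return (theta_u + theta_v) / 2.0
```

Note that with perfectly consistent data the search can terminate at the coarsest level, since both component errors then select the same candidate angle; the finer levels only run when the u- and v-derived angles disagree, as the text describes.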
So far, the horizontal global motion vector θh, the vertical global motion vector θv and the global rotational motion vector θo of the pixels have been obtained by the above optical flow method.
Based on the obtained horizontal global motion vector θh, vertical global motion vector θv and global rotational motion vector θo of the pixels, global translation and global rotation are applied to each pixel; in particular, the global motion of a pixel is removed by the following formulas,
x_d = x_u + k_x·θh + r_r(1 - cos θo)
y_d = y_u + k_y·θv + r_r·sin θo
Where: (x_u, y_u) are the coordinates of the pixel after the global motion has been removed, and (x_d, y_d) are the coordinates of the pixel before the global motion is removed; r_r is the distance of the pixel from the rotation center; k_x is the number of pixels by which the image translates in the horizontal direction for each degree the handset turns, and k_y is the number of pixels by which the image translates in the vertical direction for each degree the handset turns (the values of k_x and k_y are related to the physical characteristics of the camera and can be measured experimentally).
The pixel value of the pixel (x_d, y_d) before global motion removal is taken as the pixel value of the pixel (x_u, y_u) after global motion removal, so that the pixel with the global motion removed is obtained. When the global motion has been removed for all pixels in the current frame image, the global motion of the current frame image has been eliminated; after the global motion has been removed from the current frame image, the depth image of the current frame pretreatment image is extracted.
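A minimal sketch of this per-pixel coordinate correction, inverting the two formulas above for a single pixel (the function name is ours):

```python
import math

def remove_global_motion(x_d, y_d, theta_h, theta_v, theta_o, k_x, k_y, r_r):
    """Map the coordinates (x_d, y_d) of a pixel that still contains
    global motion to the coordinates (x_u, y_u) with the global
    translation and rotation removed, by inverting
        x_d = x_u + k_x*theta_h + r_r*(1 - cos(theta_o))
        y_d = y_u + k_y*theta_v + r_r*sin(theta_o).
    theta_o is in degrees; k_x / k_y are the measured pixels-per-degree
    factors of the camera, r_r the distance to the rotation center.
    """
    t = math.radians(theta_o)
    x_u = x_d - k_x * theta_h - r_r * (1.0 - math.cos(t))
    y_u = y_d - k_y * theta_v - r_r * math.sin(t)
    return x_u, y_u
```

The pixel value at (x_d, y_d) is then copied to (x_u, y_u), exactly as the text describes.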
To better understand the extraction of the depth image of the current frame pretreatment image in this embodiment (step S12), the principle of extracting a depth image from a video image is first briefly introduced. Referring to Fig. 5, which illustrates the principle of depth image extraction: as shown in Fig. 5, point O denotes an arbitrary point in the image scene; the camera is simplified using the pinhole imaging model; the optical centers of the camera corresponding to two frames of the video are A and B respectively; f denotes the focal length of the camera, and Z denotes the distance of point O from the camera. Point O then corresponds to different image points, pixels a and b, in the two frames. If pixel a is the image of point O in the current frame image and pixel b is the image of point O in the previous frame image, then pixel b is the match point of pixel a, i.e. the match point of pixel a in the previous frame image is pixel b. For pixels a and b, X1 denotes the distance of pixel a from the image center and X2 denotes the distance of pixel b from the image center. If Z is far larger than f, it can be derived from the geometric relationship that the value of (X1 - X2) is proportional to 1/Z, the reciprocal of the distance Z of point O from the camera. Therefore the value of (X1 - X2) multiplied by a proportionality coefficient can represent the depth information of point O.
The problem of extracting the depth image of the current frame image is therefore converted into the problem of extracting, for each pixel in the current frame image, the motion vector with respect to its corresponding match point in the reference frame image (the previous frame image). From the above it can also be seen that the absolute value of the motion vector is large for pixels of objects close to the camera and small for pixels of objects far from the camera. The more accurate the obtained motion vectors are, the more accurate the obtained depth information is, and in turn the better the quality of the three-dimensional video rendered from the depth image.
Referring to Fig. 6, which is a flow diagram of extracting the depth image of the current frame pretreatment image according to the embodiment of the present invention (corresponding to step S12 in Fig. 3), as shown in Fig. 6, extracting the depth image of the current frame pretreatment image includes:
S121: removing the pixels whose motion vectors are not credible from the current frame pretreatment image.
S122: obtaining the color area segmentation information of the current frame pretreatment image.
S123: filling, at least based on the color area segmentation information, the holes in the current frame pretreatment image from which the pixels with non-credible motion vectors have been removed, to obtain the depth image of the current frame pretreatment image.
Step S121 is executed. In this embodiment, removing the pixels whose motion vectors are not credible from the current frame pretreatment image includes:
matching the current frame pretreatment image with the previous frame image to obtain the forward matching value and the backward matching value corresponding to each pixel of the current frame pretreatment image and its match point;
dividing the current frame pretreatment image into image blocks of a predetermined size;
removing the pixels in the current frame pretreatment image for which the error between the forward matching value and the backward matching value is greater than a first threshold, and the pixels in those image blocks whose smoothness is greater than a second threshold.
In this embodiment, the matching of the current frame pretreatment image with the previous frame image to obtain the forward matching value and the backward matching value corresponding to each pixel and its match point may use a block matching method, an optical flow method, an HRM method or the like; which method is used is determined by its computational complexity. The HRM method is used in this embodiment: although its computational complexity is relatively high, the precision of the motion vectors obtained by the HRM method is higher, so the obtained depth information is more accurate and the effect of the rendered three-dimensional video is better.
Specifically, the first step: performing motion estimation on the image blocks in the current frame pretreatment image.
In this embodiment, if the current frame pretreatment image is the first frame image (i.e. the current frame image is the first frame image), the block matching motion estimation method is used to perform motion estimation on the image blocks of the current frame pretreatment image (the size of an image block is usually between 4*4 and 8*8 pixels), and the estimate is used as the initial motion vector. Block matching motion estimation is an existing motion estimation method and is therefore not described in further detail here.
If the current frame pretreatment image is not the first frame image (i.e. the current frame image is not the first frame image), the motion vector corresponding to the matched block of the image block of the previous frame pretreatment image (the matched block refers to the image block in the previous frame image corresponding to the image block in the current frame pretreatment image), the motion vector corresponding to the matched block of the image block to the left of the image block of the current frame pretreatment image, and the motion vector corresponding to the matched block of the image block above the image block of the current frame pretreatment image are used as candidate motion vectors. In this embodiment the size of an image block is preferably 4*4 pixels. The left and upper image blocks of an image block of the current frame pretreatment image can be understood as follows: taking a 160*160-pixel image divided into image blocks of 4*4 pixels as an example, there are 40*40 image blocks in total, each corresponding to a coordinate; if the coordinate of some image block is (2, 10), then the coordinate of the image block to its left is (2, 9) and the coordinate of the image block above it is (1, 10).
The matching errors of the above three candidate motion vectors are calculated respectively, and the candidate motion vector with the minimum matching error is selected as the initial motion vector. The matching error is obtained by the following formula:
Where: D is the matching error; M is the number of pixels of an image block in the horizontal direction and N is the number of pixels of an image block in the vertical direction; F_c(x, y) is the pixel value of the pixel at coordinate (x, y) in the current frame pretreatment image; F_r(x+d_x, y+d_y) is the pixel value of the pixel at coordinate (x+d_x, y+d_y) in the previous frame image; d_x is the x-direction component, and d_y the y-direction component, of the motion vector corresponding to a pixel of the current frame pretreatment image and its match point. In this embodiment the values of M and N depend on actual demand.
Furthermore, it should be noted that when the current frame pretreatment image is not the first frame image, if an image block of the current frame pretreatment image is located at the boundary of the current frame pretreatment image, its initial motion vector still needs to be obtained by the block matching motion estimation method.
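The candidate selection of the first step can be sketched as follows. Formula (5) is not reproduced in the text above (it appears only as an image in the source), so a mean absolute difference over the block is assumed here for the matching error; the function names are illustrative:

```python
import numpy as np

def block_matching_error(cur, prev, bx, by, d, block=4):
    """Matching error of candidate motion vector d = (dx, dy) for the
    block whose top-left corner is (bx, by) in the current frame.
    A mean absolute difference over the block is assumed.
    """
    dx, dy = d
    c = cur[by:by + block, bx:bx + block]
    r = prev[by + dy:by + dy + block, bx + dx:bx + dx + block]
    return float(np.abs(c.astype(int) - r.astype(int)).mean())

def pick_initial_vector(cur, prev, bx, by, candidates, block=4):
    """Select, among the candidate motion vectors (previous frame's
    matched block, left block, upper block), the one with the minimum
    matching error, as described for the first step of the HRM method."""
    return min(candidates,
               key=lambda d: block_matching_error(cur, prev, bx, by, d, block))
```

For a frame that is a pure horizontal shift of its predecessor, the true shift gives zero error and is selected.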
The second step: the motion vector corresponding to each pixel inside an image block of the current frame pretreatment image (the motion vector corresponding to a pixel refers to the motion vector between the pixel in the current frame pretreatment image and its match point; the match point of a pixel of the current frame pretreatment image refers to the pixel in the previous frame image corresponding to that pixel) is initialized to the initial motion vector, and a motion vector refinement value is obtained by a pixel-precision matching method. Specifically, it is obtained by the following formula:
d(x, y) = d_i - |f_c(x, y) - f_r(x+d_x, y+d_y)|·[u_x, u_y]^T
Wherein:
d(x, y) is the motion vector refinement value of the pixel (x, y) of the current frame pretreatment image; d_i is the initial motion vector; f_c(x, y) is the pixel value of the pixel at coordinate (x, y) in the current frame pretreatment image; f_r(x+d_x, y+d_y) is the pixel value of the pixel at coordinate (x+d_x, y+d_y) in the previous frame image; Θ is a gradient threshold related to the smoothness of the image.
The third step: obtaining the motion vector corresponding to each pixel inside the image block of the current frame pretreatment image; specifically, the initial motion vector is added to the motion vector refinement value corresponding to each pixel to obtain the motion vector corresponding to that pixel.
Among the matching errors of the motion vectors corresponding to the pixels inside the image block of the current frame pretreatment image and the matching errors corresponding to the three candidate motion vectors obtained in the first step, the motion vector with the minimum matching error is selected as the final motion vector of the image block of the current frame pretreatment image. Specifically, the matching error of the motion vector corresponding to each pixel is still calculated using formula (5) (with M = 0 and N = 0 at this time).
The final motion vector of an image block of the current frame pretreatment image obtained above is the forward motion vector corresponding to the pixels in that image block and their match points, i.e. the forward matching value corresponding to the pixels in the image block and their match points. Each image block in the current frame pretreatment image obtains its final motion vector by the above method, so the forward matching value corresponding to each pixel in the current frame pretreatment image and its match point can be obtained. The acquisition of the backward motion vector corresponding to the pixels of the current frame pretreatment image and their match points is similar to that of the forward motion vector, except that in the first to third steps of the above HRM method the current frame pretreatment image is replaced by the previous frame image and the previous frame image is replaced by the current frame pretreatment image.
So far, the forward matching value and the backward matching value corresponding to each pixel of the current frame pretreatment image and its match point have been obtained by the above HRM method.
Whether the error between the above forward matching value and backward matching value is greater than the first threshold is then detected; if it is, the motion vector corresponding to the pixel and its match point is not credible, otherwise it is credible. The first threshold is determined by actual testing and may be a length of 3 to 10 pixels; in this embodiment the value of the first threshold is a length of 5 pixels.
In this embodiment, whether the motion vectors corresponding to the pixels contained in an image block and their match points are credible is also judged by calculating the smoothness of the image blocks in the current frame pretreatment image. Specifically, the current frame pretreatment image is divided into image blocks of a predetermined size; in this embodiment the predetermined size depends on actual demand and lies between 4*4 and 8*8 pixels, preferably 4*4 pixels. The smoothness of each image block in the divided current frame pretreatment image is detected to determine whether it is greater than the second threshold.
In this embodiment the smoothness of an image block can be measured by the mean square deviation of the pixel values of all the pixels it contains, i.e. the smoothness of the image block is the mean square deviation of the pixel values of all its pixels. If the smoothness of the image block is greater than the second threshold, the motion vectors corresponding to the pixels contained in the image block and their match points are not credible, otherwise they are credible. The second threshold is determined by actual testing and may be 10 to 100; in this embodiment the value of the second threshold is 40.
Through the above judgment, the pixels in the current frame pretreatment image whose motion vectors corresponding to their match points are not credible are identified; these pixels with non-credible motion vectors then need to be removed, i.e. the pixels in the current frame pretreatment image for which the error between the forward matching value and the backward matching value is greater than the first threshold, and the pixels in those image blocks whose smoothness is greater than the second threshold, are removed.
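The two credibility tests can be combined into a per-pixel mask, sketched below. It is assumed here that the forward and backward vectors of a consistent match point in opposite directions, so the length of their sum measures the forward/backward error; array layouts and names are ours:

```python
import numpy as np

def credible_mask(fwd, bwd, block_smooth, t1=5.0, t2=40.0, block=4):
    """Per-pixel credibility mask for motion vectors.

    fwd, bwd     : HxWx2 arrays of the forward and backward motion
                   vectors per pixel (a consistent pair sums to ~0).
    block_smooth : (H/block)x(W/block) array of block smoothness values
                   (mean square deviation of each block's pixel values).
    t1, t2       : the first and second thresholds of the text
                   (a 5-pixel length and 40 in the embodiment).
    """
    err = np.linalg.norm(fwd + bwd, axis=2)        # forward/backward error
    ok_fb = err <= t1
    # expand the per-block smoothness decision to pixel resolution
    ok_smooth = np.kron((block_smooth <= t2).astype(np.uint8),
                        np.ones((block, block), dtype=np.uint8)).astype(bool)
    return ok_fb & ok_smooth                        # True = keep the pixel
```

Pixels where the mask is False are the ones removed in step S121, leaving holes for the subsequent filling step.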
After the pixels with non-credible motion vectors have been removed from the current frame pretreatment image, many holes appear in it, which therefore need to be filled. In this embodiment the color areas of the current frame pretreatment image are segmented (the image area is divided into different color blocks according to differences in color) to obtain the color area segmentation information of the current frame pretreatment image. On the one hand, the color area segmentation information is used to preserve the object edges in the current frame pretreatment image; on the other hand, the depth information of the color areas located around a hole in the obtained color area segmentation information is used to determine the depth information of the hole that needs to be filled, and the holes in the current frame pretreatment image are then filled, so as to obtain the depth image of the current frame pretreatment image.
It should be noted that if, after the pixels with non-credible motion vectors have been removed, holes of large area appear in the current frame pretreatment image, then in addition to the obtained color area segmentation information of the current frame pretreatment image, the depth information of the cached depth images of at least the previous frame image of the current frame image also needs to be used to determine the depth information of the pixels of the holes, and the holes in the current frame pretreatment image from which the pixels with non-credible motion vectors have been removed are then filled to obtain the depth image of the current frame pretreatment image. In this embodiment the large-area holes in the current frame pretreatment image are filled using the depth information of the cached depth images of the preceding 5 frames, and the depth information of a depth image is obtained by the following formula:
Where: D(x, y) is the depth information of pixel (x, y); u(x, y) is the x-direction component of the motion vector corresponding to pixel (x, y), and v(x, y) is the y-direction component of the motion vector corresponding to pixel (x, y). Filling the holes in the current frame pretreatment image by combining the color area segmentation information of the current frame pretreatment image with the depth information of the cached depth images of at least the previous frame image of the current frame image belongs to the prior art and is therefore not described in further detail here.
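The depth formula itself is shown only as an image in the source; consistent with the stated principle that nearer objects produce longer motion vectors, the depth value is assumed here to be proportional to the motion-vector magnitude (a sketch under that assumption, not the patent's exact formula):

```python
import numpy as np

def depth_from_motion(u, v, scale=1.0):
    """Depth information D(x, y) from the per-pixel motion vector,
    assumed proportional to the motion-vector magnitude:
    D = scale * sqrt(u^2 + v^2)."""
    return scale * np.sqrt(u ** 2 + v ** 2)
```
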
Furthermore, since the distance moved by the camera may change between images of different frames, the overall scale of the motion vector lengths of the pixels in each frame image may also change (i.e. the depth of a given object does not change during the camera movement, but differences in the camera movement distance cause the lengths of the motion vectors of the pixels corresponding to the object to change). The scale of the depth image corresponding to each frame therefore needs to be unified by normalization, i.e. the depth image corresponding to each frame image is normalized.
Moreover, for the areas of the current frame pretreatment image that are filled using the depth information of the depth images of cached image frames, the scale of the depth information of the cached image frames differs from that of the current frame pretreatment image, so the depth image of the current frame pretreatment image all the more needs to be normalized. Specifically, the depth image of the current frame pretreatment image is normalized by the following formula:
Where: D_r is the normalized depth information of a pixel of the current frame pretreatment image; D is the depth information of the pixel of the current frame pretreatment image; D_min is the minimum value of the pixel depth information of the current frame pretreatment image, and D_max is the maximum value of the pixel depth information of the current frame pretreatment image.
After the depth image of the current frame pretreatment image has been normalized, in this embodiment the normalized depth image of the current frame pretreatment image is also filtered, mainly in order to denoise it. In this embodiment one of smoothing filtering, median filtering and bilateral filtering may be used to filter the normalized depth image of the current frame pretreatment image.
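A minimal sketch of the normalization and of the median-filter option (the normalization formula itself appears only as an image in the source; the min-max form D_r = (D - D_min) / (D_max - D_min) is assumed from the symbol definitions):

```python
import numpy as np

def normalize_depth(depth):
    """Min-max normalization of the depth image to the range [0, 1],
    assumed from the definitions of D, D_min and D_max."""
    d_min, d_max = depth.min(), depth.max()
    return (depth - d_min) / (d_max - d_min)

def median_filter3(depth):
    """3x3 median filter, one of the denoising filter choices named in
    the text, applied to the normalized depth image."""
    padded = np.pad(depth, 1, mode='edge')
    stack = [padded[dy:dy + depth.shape[0], dx:dx + depth.shape[1]]
             for dy in range(3) for dx in range(3)]
    return np.median(np.stack(stack), axis=0)
```

The median filter removes isolated depth outliers while preserving edges better than plain averaging, which is why it is a common choice for depth-map denoising.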
After the depth image of the current frame pretreatment image with accurate depth information has been obtained through the above steps, step S13 is executed: the left-eye image and the right-eye image of the current frame image are obtained based on the current frame pretreatment image and its depth image. Specifically:
using the current frame pretreatment image as the right-eye image or the left-eye image;
mapping the depth image of the current frame pretreatment image to obtain a mapping image;
filling the holes of the mapping image to obtain the corresponding other-eye image.
In this embodiment, since the depth image of the current frame pretreatment image has also been normalized and filtered, the normalized and filtered depth image of the current frame pretreatment image is mapped first. According to some conclusions of physiological research, when synthesizing a stereoscopic image the human brain relies more on the information obtained by the right eye (for people accustomed to using the right hand), so a higher-quality right-eye image helps to improve the viewing effect of the stereo pair. Therefore, in this embodiment the current frame pretreatment image is preferably used as the right-eye image, and the normalized and filtered depth image of the current frame pretreatment image is used to map it to a left-eye image. Specifically, the mapping based on the normalized and filtered depth image of the current frame pretreatment image is realized by the following formula:
x_l = x_c + k/Z
Where: x_c is the abscissa of a pixel in the depth image of the current frame pretreatment image; x_l is the abscissa of the pixel in the left-eye image obtained by the mapping; k is a proportionality coefficient representing the pixel distance corresponding to a unit depth difference, whose size differs with the physical characteristics of the camera used by the handset and is determined by the method of actual measurement; Z is the distance between the camera and the object point.
It should be noted that, in this embodiment, since the depth image of the current frame pretreatment image has also been normalized and filtered, the above x_c should be the abscissa of the pixel in the normalized and filtered depth image of the current frame pretreatment image.
However, some holes and overlaps still appear in the mapping image (the left-eye image) obtained by the above mapping, where the hole parts are formed by the image information missing due to occlusion. Therefore the holes appearing in the mapping image still need to be filled. In this embodiment, filling the holes of the mapping image includes: performing linear filtering on the mapping image to fill in the pixel values of the hole areas; and, for the parts of the mapping image where pixels overlap, taking the pixel value of the pixel with the smaller depth information as the final pixel value.
In addition, in order to reduce the occurrence of holes, the normalized and filtered depth image of the current frame pretreatment image can be pretreated before being mapped. Specifically, in this embodiment the normalized and filtered depth image of the current frame pretreatment image is first low-pass filtered and then mapped; after the low-pass filtering, the edges of the obtained depth image are smoothed, which reduces the occurrence of holes and improves the quality of the finally rendered left-eye image.
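The mapping x_l = x_c + k/Z, the smaller-depth rule for overlaps, and the linear hole filling can be sketched together as follows (a simplified single-channel, integer-shift version; the names and the row-wise interpolation detail are ours):

```python
import numpy as np

def render_left_view(right, depth, k=8.0):
    """Render the other eye's view from the image used as the right-eye
    view and its per-pixel distance Z (non-zero): shift each pixel by
    the disparity k/Z, keep the pixel with the smaller depth value on
    overlaps, and fill remaining holes by linear interpolation along
    each row.  k is the camera-dependent proportionality coefficient.
    """
    h, w = right.shape
    left = np.full((h, w), np.nan)
    z_buf = np.full((h, w), np.inf)
    for y in range(h):
        for x in range(w):
            xl = int(round(x + k / depth[y, x]))   # disparity shift
            if 0 <= xl < w and depth[y, x] < z_buf[y, xl]:
                left[y, xl] = right[y, x]          # smaller depth wins overlaps
                z_buf[y, xl] = depth[y, x]
    # fill holes (NaNs) by linear interpolation along each row
    for y in range(h):
        row = left[y]
        holes = np.isnan(row)
        if holes.any() and not holes.all():
            idx = np.flatnonzero(~holes)
            row[holes] = np.interp(np.flatnonzero(holes), idx, row[idx])
    return left
```

The z-buffer realizes the "smaller depth information wins" rule for overlapping pixels, and the interpolation pass stands in for the linear filtering of the hole areas described in the text.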
In this embodiment the current frame pretreatment image is used as the right-eye image, the normalized and filtered depth image of the current frame pretreatment image is pretreated and then mapped, and the image after hole filling serves as the left-eye image. In other embodiments the current frame pretreatment image may also be used as the left-eye image, with the image obtained after pretreating and mapping the normalized and filtered depth image of the current frame pretreatment image and filling the holes serving as the right-eye image.
After the left-eye image and the right-eye image of the current frame image have been obtained by the above video image conversion method, inputting the obtained left-eye image and right-eye image into a 3D video display system can display a stereoscopic 3D video.
In addition, by the above video image conversion method, performing video coding on the left-eye images and right-eye images of each frame image of the acquired 2D video yields a three-dimensional video code stream; performing image compression on the left-eye image and right-eye image of the current frame image of the acquired 2D video yields a static single-frame 3D image.
Corresponding to the above video image conversion method, an embodiment of the present invention also provides a video image conversion device. Referring to Fig. 7, which is a structural schematic diagram of the video image conversion device of the embodiment of the present invention, as shown in Fig. 7, the video image conversion device includes:
a pretreatment unit A10, adapted to pretreat the current frame image of a two-dimensional video to obtain a current frame pretreatment image, the pretreatment including: removing the global motion of the current frame image;
a depth image extraction unit A11, connected with the pretreatment unit A10 and adapted to extract the depth image of the current frame pretreatment image;
a depth image rendering unit A12, connected with the depth image extraction unit A11 and adapted to obtain the left-eye image and the right-eye image of the current frame image based on the current frame pretreatment image and its depth image.
The pretreatment unit A10 includes:
a vector field acquiring unit (not shown), adapted to obtain the global motion vector field and the global rotational motion vector field of the current frame image;
a global motion removal unit (not shown), adapted to perform global translation and global rotation on the current frame image based on the global motion vector field and the global rotational motion vector field of the current frame image.
In this embodiment, the pretreatment unit A10 further includes:
a distortion removal unit (not shown), adapted to remove the distortion of the current frame image before the global motion of the current frame image is removed. The distortion removal unit is adapted to remove the distortion of the current frame image using a barrel distortion model.
The depth image extraction unit A11 includes:
a removal unit 110, connected with the pretreatment unit A10 and adapted to remove the pixels whose motion vectors are not credible from the current frame pretreatment image;
a segmentation information acquiring unit 111, connected with the pretreatment unit A10 and adapted to obtain the color area segmentation information of the current frame pretreatment image;
a first hole filling unit 112, connected with the removal unit 110 and the segmentation information acquiring unit 111 respectively, adapted to fill, at least based on the color area segmentation information, the holes in the current frame pretreatment image from which the pixels with non-credible motion vectors have been removed, to obtain the depth image of the current frame pretreatment image.
Wherein, the removal unit 110 includes:
A first buffer unit (not shown), adapted to store the previous frame image of the current frame image.
A matching unit (not shown), adapted to match the current-frame pre-processed image against the previous frame image to obtain, for each pixel of the current-frame pre-processed image, the forward matching value and the backward matching value corresponding to its matched point.
An image segmentation unit (not shown), adapted to divide the current-frame pre-processed image into image blocks of a predetermined size.
A first detection unit (not shown), adapted to detect whether the error between the forward matching value and the backward matching value of a pixel in the current-frame pre-processed image exceeds a first threshold.
A second detection unit (not shown), adapted to detect whether the smoothness of an image block produced by the image segmentation unit exceeds a second threshold.
A first removal unit (not shown), adapted to remove the pixels of an image block from the current-frame pre-processed image when the block contains pixels whose error between the forward matching value and the backward matching value exceeds the first threshold and the smoothness of the block exceeds the second threshold.
In the present embodiment, the matching unit matches the current-frame pre-processed image against the previous frame image using one of block matching, optical flow and hybrid recursive matching (HRM).
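A minimal block-matching sketch of the forward/backward reliability check described above: each block of the current frame is matched (SAD) into the previous frame, the best-matching previous block is matched back, and blocks whose two matching errors disagree are flagged. Block size, search range and threshold are illustrative assumptions, not values from the patent:

```python
import numpy as np

def unreliable_pixel_mask(cur, prev, block=4, search=2, err_thresh=8):
    """Flag pixels whose motion vectors look unreliable.

    Forward match: best SAD of each current-frame block in the previous
    frame. Backward match: best SAD of that previous block back in the
    current frame. A large disagreement between the two errors marks the
    block's pixels as unreliable.
    """
    h, w = cur.shape
    mask = np.zeros((h, w), dtype=bool)

    def best_match(src_blk, ref, y, x):
        # Exhaustive SAD search in a (2*search+1)^2 window around (y, x).
        best_err, best_pos = np.inf, (y, x)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy <= h - block and 0 <= xx <= w - block:
                    cand = ref[yy:yy + block, xx:xx + block].astype(np.int64)
                    err = np.abs(src_blk - cand).sum()
                    if err < best_err:
                        best_err, best_pos = err, (yy, xx)
        return best_err, best_pos

    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            blk = cur[y:y + block, x:x + block].astype(np.int64)
            fwd_err, (py, px) = best_match(blk, prev, y, x)
            prev_blk = prev[py:py + block, px:px + block].astype(np.int64)
            bwd_err, _ = best_match(prev_blk, cur, py, px)
            if abs(fwd_err - bwd_err) > err_thresh:
                mask[y:y + block, x:x + block] = True
    return mask
```

For two identical frames every forward and backward error is zero, so no pixel is flagged.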
In the present embodiment, the depth image extraction unit A11 further includes:
A second buffer unit (not shown), adapted to cache the depth image of at least the previous frame image of the current frame image.
The first hole-filling unit 112 is further adapted to determine the depth information of the pixels in a hole by combining the color-region segmentation information with the depth information of the depth image of at least the previous frame image cached by the second buffer unit, so as to fill the hole.
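The hole-filling rule above can be sketched as follows: a hole pixel takes a representative depth of the valid pixels in its color segment, falling back to the previous frame's depth when the whole segment is a hole. The zero-as-hole convention and the use of the median as the representative value are assumptions of this sketch, not specified by the patent:

```python
import numpy as np

def fill_depth_holes(depth, labels, prev_depth):
    """Fill holes (depth == 0) in a depth map using color segments.

    For each color-segment label, hole pixels get the median depth of
    the segment's valid pixels; if the entire segment is a hole, depth
    is copied from the previous frame's cached depth map.
    """
    out = depth.copy()
    holes = depth == 0
    for lab in np.unique(labels):
        seg = labels == lab
        seg_holes = seg & holes
        if not seg_holes.any():
            continue
        valid = depth[seg & ~holes]
        if valid.size:
            out[seg_holes] = np.median(valid)
        else:
            # Whole segment is a hole: fall back to the previous frame.
            out[seg_holes] = prev_depth[seg_holes]
    return out
```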
The depth-image rendering unit A12 includes:
An output unit 120, connected with the pre-processing unit A10 and adapted to output the current-frame pre-processed image as the right-eye image or the left-eye image.
A mapping unit 121, connected with the first hole-filling unit 112 and adapted to map the depth image of the current-frame pre-processed image to obtain a mapped image.
A second hole-filling unit 122, connected with the mapping unit 121 and adapted to perform hole filling on the mapped image to obtain the corresponding other-eye image.
In the present embodiment, the mapping unit 121 is adapted to low-pass filter the depth image of the current-frame pre-processed image before mapping it.
The second hole-filling unit 122 includes:
A second filter unit (not shown), adapted to apply linear filtering to the mapped image.
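The mapping and hole-filling steps can be sketched together: the depth map is low-pass filtered (here a 1x3 box blur), each pixel is shifted horizontally by a disparity proportional to its smoothed depth, and remaining holes are filled by linear interpolation along each row. The disparity scale `max_disp`, the box-blur kernel and the row-wise interpolation are illustrative assumptions, not the patent's exact procedure:

```python
import numpy as np

def render_other_view(img, depth, max_disp=4):
    """Depth-image-based rendering of the second view.

    Low-pass the depth map so depth edges are smoothed before mapping
    (fewer holes), shift pixels horizontally by depth-proportional
    disparity, then linearly interpolate over remaining holes per row.
    Overlapping writes (occlusions) simply keep the last-written pixel.
    """
    h, w = depth.shape
    # 1x3 box blur as a simple low-pass filter on the depth map.
    padded = np.pad(depth.astype(np.float64), ((0, 0), (1, 1)), mode="edge")
    smooth = (padded[:, :-2] + padded[:, 1:-1] + padded[:, 2:]) / 3.0
    disp = np.round(smooth / max(smooth.max(), 1e-9) * max_disp).astype(int)

    out = np.zeros_like(img, dtype=np.float64)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            tx = x + disp[y, x]
            if 0 <= tx < w:
                out[y, tx] = img[y, x]
                filled[y, tx] = True
        # Linear interpolation over the holes left in this row.
        xs = np.nonzero(filled[y])[0]
        if xs.size:
            holes = np.nonzero(~filled[y])[0]
            out[y, holes] = np.interp(holes, xs, out[y, xs])
    return out
```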
In the present embodiment, the video image conversion device further includes: a video encoding unit (not shown), adapted to video-encode the left-eye image and the right-eye image of each frame image of the two-dimensional video to obtain a three-dimensional video bitstream.
An image compression unit (not shown), adapted to compress the left-eye image and the right-eye image of the current frame to obtain a three-dimensional image.
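Before encoding or compression, the two views must be combined into a stereo frame; side-by-side packing is one common layout and serves as an illustrative stand-in here, since the patent does not specify a particular stereo format:

```python
import numpy as np

def pack_side_by_side(left, right):
    """Pack left- and right-eye images into one side-by-side stereo frame.

    The packed frame can then be fed to an ordinary 2D encoder to
    produce a 3D bitstream. Frame packing choice is an assumption of
    this sketch.
    """
    if left.shape != right.shape:
        raise ValueError("left and right views must have the same shape")
    return np.concatenate([left, right], axis=1)
```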
Referring to Fig. 8, Fig. 8 is a structural schematic diagram of a video image conversion device according to another embodiment of the present invention. In Fig. 8, the pre-processing unit B10, the removal unit 110, the segmentation-information acquiring unit 111, the first hole-filling unit 112, the output unit 120 and the second hole-filling unit 122 are similar to those in Fig. 7. The difference is that, in the present embodiment, the depth-image extraction unit B11 further includes, in addition to the removal unit 110, the segmentation-information acquiring unit 111 and the first hole-filling unit 112:
A normalization unit 113, connected with the first hole-filling unit 112 and adapted to normalize the depth image of the current-frame pre-processed image.
A first filter unit 114, connected with the normalization unit 113 and adapted to filter the normalized depth image of the current-frame pre-processed image.
The depth-image rendering unit B12 is adapted to obtain the left-eye image and the right-eye image of the current frame image based on the current-frame pre-processed image and the normalized and filtered depth image of the current-frame pre-processed image. Therefore, in Fig. 8, the mapping unit 121 is connected with the first filter unit 114 and is adapted to map the normalized and filtered depth image of the current-frame pre-processed image to obtain the mapped image.
The first filter unit 114 filters the normalized depth image of the current-frame pre-processed image by one of smoothing filtering, median filtering and bilateral filtering.
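A sketch of the normalize-then-filter step, showing one of the three filters named above (the median filter). The [0, 255] output range and the 3x3 kernel are illustrative choices:

```python
import numpy as np

def normalize_and_median(depth, k=3):
    """Normalize a depth map to [0, 255] and apply a k x k median filter.

    Median filtering suppresses isolated outlier depths while keeping
    depth edges, which is why it is a common choice here.
    """
    d = depth.astype(np.float64)
    span = d.max() - d.min()
    norm = (d - d.min()) / span * 255.0 if span > 0 else np.zeros_like(d)
    pad = k // 2
    padded = np.pad(norm, pad, mode="edge")
    out = np.empty_like(norm)
    h, w = norm.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(padded[y:y + k, x:x + k])
    return out
```

A single impulse in an otherwise flat depth map is removed entirely by the 3x3 median, illustrating the outlier suppression.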
The process by which the video image conversion device of the embodiments of the present invention converts a two-dimensional video into a three-dimensional video may refer to the video image conversion method described above, and is not repeated here.
In conclusion, the technical solution of the present invention has at least the following beneficial effects:
The current frame image of the two-dimensional video is first pre-processed to obtain the current-frame pre-processed image; the depth image of that pre-processed image is then obtained; finally, a left-eye image or a right-eye image is mapped out based on the depth image while the pre-processed image itself serves as the other-eye image. This converts a two-dimensional video into a three-dimensional video on a mobile phone, and also allows the user, when making 3D videos with a single-camera phone, to move the phone in any direction and obtain a 3D video of good quality without multiple operations, which is of great convenience to the user.
The distortion and the global motion of the current frame image of the two-dimensional video are removed by pre-processing to obtain the pre-processed image of the current frame; pixels with unreliable motion vectors are removed from the current-frame pre-processed image, and the holes in the image from which those pixels have been removed are filled based on the color-region segmentation information of the current-frame pre-processed image, thereby obtaining the depth image. Compared with a depth image obtained directly from the current frame image, this depth image is more accurate, which in turn improves the quality of the finally obtained 3D video.
Further, after the pixels with unreliable motion vectors are removed from the current-frame pre-processed image, the large-area holes in the image are filled using the depth information of the cached depth image of at least the previous frame image together with the color-region segmentation information of the current frame image, which further improves the accuracy of the obtained depth image of the current-frame pre-processed image.
Normalizing and filtering the depth image of the current-frame pre-processed image further improves the accuracy of the depth image, and hence the quality of the 3D video.
Mapping the depth image of the current-frame pre-processed image includes: low-pass filtering the depth image before mapping, so that the edges of the depth image are smoothed. This reduces the holes generated when mapping the depth image, improves the quality of the finally obtained left-eye or right-eye image of the current frame, and hence also improves the quality of the 3D video.
Although the invention has been disclosed above through preferred embodiments, it is not intended to limit the invention. Any person skilled in the art may, without departing from the spirit and scope of the invention, make possible changes and modifications to the technical solution of the invention using the methods and technical content disclosed above. Therefore, any simple amendments, equivalent changes and modifications made to the above embodiments according to the technical essence of the invention, which do not depart from the content of the technical solution of the invention, fall within the protection scope of the technical solution of the invention.

Claims (26)

1. A method for converting a video image, characterized by comprising:
Pre-processing a current frame image of a two-dimensional video to obtain a current-frame pre-processed image, the pre-processing including: removing the global motion of the current frame image;
Extracting a depth image of the current-frame pre-processed image, including: removing pixels with unreliable motion vectors from the current-frame pre-processed image, obtaining color-region segmentation information of the current-frame pre-processed image, and filling, based at least on the color-region segmentation information, the holes in the current-frame pre-processed image from which the pixels with unreliable motion vectors have been removed, to obtain the depth image of the current-frame pre-processed image;
Obtaining a left-eye image and a right-eye image of the current frame image based on the current-frame pre-processed image and its depth image, including: taking the current-frame pre-processed image as the right-eye image or the left-eye image; mapping the depth image of the current-frame pre-processed image to obtain a mapped image; and performing hole filling on the mapped image to obtain the corresponding other-eye image.
2. The method for converting a video image according to claim 1, characterized in that removing the global motion of the current frame image includes:
Obtaining a global motion vector field and a global rotational motion vector field of the current frame image;
Performing global translation and global rotation on the current frame image based on its global motion vector field and global rotational motion vector field.
3. The method for converting a video image according to claim 1, characterized in that the pre-processing further includes: removing the distortion of the current frame image before removing its global motion.
4. The method for converting a video image according to claim 3, characterized in that the distortion of the current frame image is removed using a barrel distortion model.
5. The method for converting a video image according to claim 1, characterized in that removing pixels with unreliable motion vectors from the current-frame pre-processed image includes:
Matching the current-frame pre-processed image against the previous frame image to obtain, for each pixel of the current-frame pre-processed image, a forward matching value and a backward matching value corresponding to its matched point;
Dividing the current-frame pre-processed image into image blocks of a predetermined size;
Removing the pixels of an image block when the block contains pixels whose error between the forward matching value and the backward matching value exceeds a first threshold and the smoothness of the block exceeds a second threshold.
6. The method for converting a video image according to claim 5, characterized in that the current-frame pre-processed image is matched against the previous frame image using one of block matching, optical flow and hybrid recursive matching.
7. The method for converting a video image according to claim 1, characterized in that filling, based at least on the color-region segmentation information, the holes in the current-frame pre-processed image from which the pixels with unreliable motion vectors have been removed includes: determining the depth information of the pixels in a hole by combining the color-region segmentation information with the depth information of a cached depth image of at least the previous frame image of the current frame image, so as to fill the hole.
8. The method for converting a video image according to claim 1, characterized in that the depth image of the current-frame pre-processed image is the normalized and filtered depth image of the current-frame pre-processed image.
9. The method for converting a video image according to claim 8, characterized in that the filtering is one of smoothing filtering, median filtering and bilateral filtering.
10. The method for converting a video image according to claim 1, characterized in that mapping the depth image of the current-frame pre-processed image includes: low-pass filtering the depth image of the current-frame pre-processed image before mapping it.
11. The method for converting a video image according to claim 1, characterized in that performing hole filling on the mapped image includes: applying linear filtering to the mapped image.
12. The method for converting a video image according to claim 1, characterized by further comprising: video-encoding the left-eye image and the right-eye image of each frame image of the two-dimensional video to obtain a three-dimensional video bitstream.
13. The method for converting a video image according to claim 1, characterized by further comprising: compressing the left-eye image and the right-eye image of the current frame to obtain a three-dimensional image.
14. A device for converting a video image, characterized by comprising:
A pre-processing unit, adapted to pre-process a current frame image of a two-dimensional video to obtain a current-frame pre-processed image, the pre-processing including: removing the global motion of the current frame image;
A depth-image extraction unit, adapted to extract a depth image of the current-frame pre-processed image;
A depth-image rendering unit, adapted to obtain a left-eye image and a right-eye image of the current frame image based on the current-frame pre-processed image and its depth image;
The depth-image extraction unit includes:
A removal unit, adapted to remove pixels with unreliable motion vectors from the current-frame pre-processed image;
A segmentation-information acquiring unit, adapted to obtain the color-region segmentation information of the current-frame pre-processed image;
A first hole-filling unit, adapted to fill, based at least on the color-region segmentation information, the holes in the current-frame pre-processed image from which the pixels with unreliable motion vectors have been removed, to obtain the depth image of the current-frame pre-processed image;
The depth-image rendering unit includes: an output unit, adapted to output the current-frame pre-processed image as the right-eye image or the left-eye image; a mapping unit, adapted to map the depth image of the current-frame pre-processed image to obtain a mapped image; and a second hole-filling unit, adapted to perform hole filling on the mapped image to obtain the corresponding other-eye image.
15. The device for converting a video image according to claim 14, characterized in that the pre-processing unit includes:
A vector-field acquiring unit, adapted to obtain a global motion vector field and a global rotational motion vector field of the current frame image;
A global-motion removal unit, adapted to perform global translation and global rotation on the current frame image based on its global motion vector field and global rotational motion vector field.
16. The device for converting a video image according to claim 15, characterized in that the pre-processing unit further includes: a distortion removal unit, adapted to remove the distortion of the current frame image before the global motion of the current frame image is removed.
17. The device for converting a video image according to claim 16, characterized in that the distortion removal unit is adapted to remove the distortion of the current frame image using a barrel distortion model.
18. The device for converting a video image according to claim 14, characterized in that the removal unit includes:
A first buffer unit, adapted to store the previous frame image of the current frame image;
A matching unit, adapted to match the current-frame pre-processed image against the previous frame image to obtain, for each pixel of the current-frame pre-processed image, a forward matching value and a backward matching value corresponding to its matched point;
An image segmentation unit, adapted to divide the current-frame pre-processed image into image blocks of a predetermined size;
A first detection unit, adapted to detect whether the error between the forward matching value and the backward matching value of a pixel in the current-frame pre-processed image exceeds a first threshold;
A second detection unit, adapted to detect whether the smoothness of an image block produced by the image segmentation unit exceeds a second threshold;
A first removal unit, adapted to remove the pixels of an image block when the block contains pixels whose error between the forward matching value and the backward matching value exceeds the first threshold and the smoothness of the block exceeds the second threshold.
19. The device for converting a video image according to claim 18, characterized in that the current-frame pre-processed image is matched against the previous frame image using one of block matching, optical flow and hybrid recursive matching.
20. The device for converting a video image according to claim 14, characterized in that the depth-image extraction unit further includes:
A second buffer unit, adapted to cache the depth image of at least the previous frame image of the current frame image;
The first hole-filling unit is further adapted to determine the depth information of the pixels in a hole by combining the color-region segmentation information with the depth information of the depth image of at least the previous frame image cached by the second buffer unit, so as to fill the hole.
21. The device for converting a video image according to claim 14, characterized by further comprising:
A normalization unit, adapted to normalize the depth image of the current-frame pre-processed image;
A first filter unit, adapted to filter the normalized depth image of the current-frame pre-processed image;
The depth-image rendering unit is adapted to obtain the left-eye image and the right-eye image of the current frame image based on the current-frame pre-processed image and the normalized and filtered depth image of the current-frame pre-processed image.
22. The device for converting a video image according to claim 21, characterized in that the filtering is one of smoothing filtering, median filtering and bilateral filtering.
23. The device for converting a video image according to claim 14, characterized in that the mapping unit is adapted to low-pass filter the depth image of the current-frame pre-processed image before mapping it.
24. The device for converting a video image according to claim 14, characterized in that the second hole-filling unit includes: a second filter unit, adapted to apply linear filtering to the mapped image.
25. The device for converting a video image according to claim 14, characterized by further comprising: a video encoding unit, adapted to video-encode the left-eye image and the right-eye image of each frame image of the two-dimensional video to obtain a three-dimensional video bitstream.
26. The device for converting a video image according to claim 14, characterized by further comprising: an image compression unit, adapted to compress the left-eye image and the right-eye image of the current frame to obtain a three-dimensional image.
CN201210013123.XA 2012-01-16 2012-01-16 The conversion method and device of video image Active CN103208110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210013123.XA CN103208110B (en) 2012-01-16 2012-01-16 The conversion method and device of video image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210013123.XA CN103208110B (en) 2012-01-16 2012-01-16 The conversion method and device of video image

Publications (2)

Publication Number Publication Date
CN103208110A CN103208110A (en) 2013-07-17
CN103208110B true CN103208110B (en) 2018-08-24

Family

ID=48755327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210013123.XA Active CN103208110B (en) 2012-01-16 2012-01-16 The conversion method and device of video image

Country Status (1)

Country Link
CN (1) CN103208110B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103561028A (en) * 2013-11-06 2014-02-05 烽火通信科技股份有限公司 IMS video telephone based on naked-eye 3D technology
CN107666606B (en) 2016-07-29 2019-07-12 东南大学 Binocular panoramic picture acquisition methods and device
CN108537721B (en) * 2017-03-02 2021-09-07 株式会社理光 Panoramic image processing method and device and electronic equipment
US10564174B2 (en) * 2017-09-06 2020-02-18 Pixart Imaging Inc. Optical sensing apparatuses, method, and optical detecting module capable of estimating multi-degree-of-freedom motion
CN108234985B (en) * 2018-03-21 2021-09-03 南阳师范学院 Filtering method under dimension transformation space for rendering processing of reverse depth map
CN111556244B (en) * 2020-04-23 2022-03-11 北京百度网讯科技有限公司 Video style migration method and device
CN111833269B (en) * 2020-07-13 2024-02-02 字节跳动有限公司 Video noise reduction method, device, electronic equipment and computer readable medium
CN116711303A (en) * 2021-01-06 2023-09-05 华为技术有限公司 Three-dimensional video call method and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7573475B2 (en) * 2006-06-01 2009-08-11 Industrial Light & Magic 2D to 3D image conversion
CN101968895A (en) * 2009-07-27 2011-02-09 鸿富锦精密工业(深圳)有限公司 Two-dimensional image conversion system and method
CN102163331A (en) * 2010-02-12 2011-08-24 王炳立 Image-assisting system using calibration method
CN102098526B (en) * 2011-01-28 2012-08-22 清华大学 Depth map calculating method and device

Also Published As

Publication number Publication date
CN103208110A (en) 2013-07-17

Similar Documents

Publication Publication Date Title
CN103208110B (en) The conversion method and device of video image
CN108734776B (en) Speckle-based three-dimensional face reconstruction method and equipment
CN104504671B (en) Method for generating virtual-real fusion image for stereo display
US9445072B2 (en) Synthesizing views based on image domain warping
US8711204B2 (en) Stereoscopic editing for video production, post-production and display adaptation
CN101902657B (en) Method for generating virtual multi-viewpoint images based on depth image layering
CN109035394B (en) Face three-dimensional model reconstruction method, device, equipment and system and mobile terminal
CN101247530A (en) Three-dimensional image display apparatus and method for enhancing stereoscopic effect of image
US20120139906A1 (en) Hybrid reality for 3d human-machine interface
WO2010119852A1 (en) Arbitrary viewpoint image synthesizing device
CN109919911A (en) Moving three dimension method for reconstructing based on multi-angle of view photometric stereo
CN104506872B (en) A kind of method and device of converting plane video into stereoscopic video
CN109769109A (en) Method and system based on virtual view synthesis drawing three-dimensional object
CN105812766B (en) A kind of vertical parallax method for reducing
CN105612742A (en) Remapping a depth map for 3D viewing
CN104853175B (en) Novel synthesized virtual viewpoint objective quality evaluation method
CN109218706B (en) Method for generating stereoscopic vision image from single image
Zhang et al. Adaptive reconstruction of intermediate views from stereoscopic images
CN111899293B (en) Virtual and real shielding processing method in AR application
CN106169179A (en) Image denoising method and image noise reduction apparatus
Hanhart et al. Free-viewpoint video sequences: A new challenge for objective quality metrics
Kao Stereoscopic image generation with depth image based rendering
Knorr et al. From 2D-to stereo-to multi-view video
Sharma et al. A novel image fusion scheme for ftv view synthesis based on layered depth scene representation & scale periodic transform
US20190297319A1 (en) Individual visual immersion device for a moving person

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20190314

Address after: 101399 Building 8-07, Ronghui Garden 6, Shunyi Airport Economic Core Area, Beijing

Patentee after: Xin Xin finance leasing (Beijing) Co.,Ltd.

Address before: 201203 Shanghai Pudong New Area Pudong Zhangjiang hi tech park, 2288 Chong Nong Road, exhibition center, 1 building.

Patentee before: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

EE01 Entry into force of recordation of patent licensing contract
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20130717

Assignee: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Assignor: Xin Xin finance leasing (Beijing) Co.,Ltd.

Contract record no.: X2021110000008

Denomination of invention: Video image conversion method and device

Granted publication date: 20180824

License type: Exclusive License

Record date: 20210317

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20221021

Address after: 201203 Shanghai city Zuchongzhi road Pudong New Area Zhangjiang hi tech park, Spreadtrum Center Building 1, Lane 2288

Patentee after: SPREADTRUM COMMUNICATIONS (SHANGHAI) Co.,Ltd.

Address before: 101399 Building 8-07, Ronghui Garden 6, Shunyi Airport Economic Core Area, Beijing

Patentee before: Xin Xin finance leasing (Beijing) Co.,Ltd.