CN104506872B - Method and device for converting planar video into stereoscopic video

Info

Publication number: CN104506872B
Application number: CN201410697508.1A
Authority: CN
Inventors: 张新; 柯家琪; 廖智宏
Original assignee: SHENZHEN KAIAOSI TECHNOLOGY Co Ltd
Current assignee: Hangzhou Youshu Technology Co., Ltd.
Priority date: 2014-11-26
Filing date: 2014-11-26
Publication date: 2017-09-29
Anticipated expiration: 2034-11-26
Also published as: CN104506872A

Abstract

The invention discloses a method and device for converting planar video into stereoscopic video. The method performs the following steps on each frame of the planar video. S1, obtaining a depth map D of the current frame image: a first depth map D₁ is obtained by block-matching motion estimation; a second depth map D₂ is built from geometric perspective relations using an edge detection algorithm and a Hough transform algorithm; a third depth map D₃ is estimated by a method based on color information; and depth fusion is performed on D₁, D₂ and D₃ to obtain the depth map D. S2, generating multi-view stereoscopic views from a reference image and the depth map D based on a DIBR algorithm. S3, according to the stereoscopic video format required by the user, selecting at least some of the left-eye and right-eye views from the multi-view stereoscopic views and performing stereoscopic rendering to generate a color stereoscopic video in the corresponding format. The stereoscopic video generated by the method not only has a good stereoscopic effect, but can also be produced in different formats according to user requirements.

Description

Method and device for converting planar video into stereoscopic video

Technical field

The present invention relates to a method and device for converting planar video into stereoscopic video.

Background

Planar-to-stereoscopic video conversion, also known as 2D-to-3D conversion, applies suitable technical means to existing planar video to fully extract the depth information it contains, and uses that depth information to simulate the virtual scenes observed from multiple viewpoints, thereby producing stereoscopic perception. In stereoscopic video technology, stereoscopic perception is achieved through binocular stereo vision. Binocular stereo vision exploits the principle of binocular imaging: by simulating how the two eyes perceive a picture, left and right images or video streams are projected by special means into a person's left and right eyes respectively, and the brain reconstructs the three-dimensional scene in the image or video, producing stereoscopic perception.

As a new way of describing the three-dimensional world, a stereoscopic image contains not only the scene surface information found in a conventional planar image, but also three-dimensional information related to specific positions in the scene, namely depth information. Compared with conventional planar video, stereoscopic video reflects real-world scenes more faithfully.

A depth map of a scene can be obtained by stereo vision algorithms. The depth map reflects the front-to-back, or near-to-far, relationships in the scene corresponding to the planar image. In practice, the depth map is usually represented as an 8-bit grayscale image. A value of 0 at a point on the depth map means that the corresponding point of the planar image lies at the farthest position within the relative depth range; a value of 255 means that the corresponding point lies at the nearest position within the relative depth range; other values between 0 and 255 represent intermediate depths within that range.

3D video technology has by now made considerable progress, and a range of stereoscopic video capture equipment, from high-end to low-end, has appeared on the market. After years of technological accumulation, stereoscopic display devices have become increasingly affordable, and 3D televisions are entering more and more ordinary households. Behind the recent prosperity of the 3D industry, however, lie problems such as expensive high-end stereoscopic capture equipment, a shortage of high-quality stereoscopic source material, and the high cost of manual 3D video production. These problems are increasingly becoming bottlenecks for the development of 3D video.

In addition, stereoscopic display devices based on several display principles currently coexist on the market, such as grating-type, shutter-type and polarization-type devices. Shutter-type and polarization-type display devices usually require special stereoscopic glasses for viewing, whereas grating-type display devices present a stereoscopic scene without special glasses. However, the stereoscopic video formats supported by grating-type display devices are divided into binocular formats and multi-view formats, and existing stereoscopic video processing devices can often output stereoscopic video in only one format, with a poor stereoscopic effect, which greatly limits their range of use.

Summary of the invention

The main object of the present invention is to provide a method and device for converting planar video into stereoscopic video, so as to solve the technical problem that existing stereoscopic video processing devices output stereoscopic video in a single format and with an unsatisfactory stereoscopic effect.

The method for converting planar video into stereoscopic video proposed by the present invention is as follows:

A method for converting planar video into stereoscopic video, comprising performing the following steps on each frame of the planar video:

S1, obtaining a depth map D of the current frame image: obtaining a first depth map D₁ by block-matching motion estimation; extracting the vanishing point and vanishing lines in the current frame image by an edge detection algorithm and a Hough transform algorithm, and building a second depth map D₂ from the relationship between image depth and the vanishing point and vanishing lines; estimating a third depth map D₃ by a method based on color information; and performing depth fusion on the first depth map D₁, the second depth map D₂ and the third depth map D₃ to obtain the depth map D of the current frame image;

S2, generating multi-view stereoscopic views from a reference image and the depth map D based on a DIBR algorithm, wherein the reference image is the current frame image and the multi-view stereoscopic views comprise multiple pairs of left-eye and right-eye views;

S3, according to the user's stereoscopic video output format requirement, selecting at least one pair of the left-eye and right-eye views from the multi-view stereoscopic views and performing stereoscopic rendering, so as to generate a color stereoscopic video in the corresponding format.

In the above method, different depth maps are obtained for each frame by different methods, and these depth maps are then fused by weighting to obtain one final depth map per frame. Based on that final depth map and the frame itself, multi-view stereoscopic views are generated using the DIBR algorithm, and stereoscopic video is generated after stereoscopic rendering. Because this scheme derives several depth maps from a single frame by different methods, fuses them into the final depth map, and bases all subsequent processing on that final depth map, the resulting stereoscopic video has a good stereoscopic effect. Moreover, because the result is a set of multi-view stereoscopic views, different view pairs (a view pair comprising a left-eye image and a right-eye image) can be selected from it to generate stereoscopic video in different formats. For example, one pair of stereoscopic views at a given viewpoint can be selected to generate stereoscopic video in red-blue format, binocular format or side-by-side format; multiple pairs of stereoscopic views can also be selected to generate stereoscopic video in multi-view format or row-interleaved format. The user can have the stereoscopic video rendered in the format that matches the display principle of the user's stereoscopic display device, so that it can be shown on stereoscopic display devices with different display principles.

The device for converting planar video into stereoscopic video proposed by the present invention is as follows:

A device for converting planar video into stereoscopic video, comprising a control module, a cache module, a video conversion module and a stereoscopic rendering module. The cache module stores the RGB video to be processed and the intermediate results of processing. The video conversion module is connected to the cache module and to the stereoscopic rendering module; it converts the planar images of the RGB video to be processed into multi-view stereoscopic views and inputs the multi-view stereoscopic views to the stereoscopic rendering module, the multi-view stereoscopic views comprising multiple pairs of left-eye and right-eye views. The stereoscopic rendering module selects at least one pair of the left-eye and right-eye views from the multi-view stereoscopic views according to the user's stereoscopic video output format requirement, performs stereoscopic rendering on the selected views, and generates a color stereoscopic video in the corresponding format. The control module is connected to the video conversion module and to the stereoscopic rendering module, and configures the device according to user requirements, which include the stereoscopic video output format requirement.

Compared with the prior art, the above device has the following advantage: according to the user's requirement for the output stereoscopic video format, different selections can be made from the multiple pairs of left-eye and right-eye views of the multi-view stereoscopic views, and the selected views are stereoscopically rendered to generate stereoscopic video in the corresponding format. The device can therefore serve stereoscopic display devices with different display principles and has a very wide range of application.

Brief description of the drawings

Fig. 1 is a flowchart of a method for converting planar video into stereoscopic video provided by a specific embodiment of the invention;

Fig. 2 is a detailed flowchart of step 40 in Fig. 1;

Fig. 3 is a schematic diagram of the edge detection algorithm as implemented in the FPGA;

Fig. 4 is a schematic diagram of the Hough transform algorithm as implemented in the FPGA;

Fig. 5 is a schematic diagram of bilateral filtering of the depth map D as implemented in the FPGA;

Fig. 6 is a block diagram of a device for converting planar video into stereoscopic video provided by a specific embodiment of the invention;

Fig. 7 is a block diagram of the operating principle of a specific embodiment of the video conversion module in Fig. 6;

Fig. 8 is a block diagram of the operating principle of a specific embodiment of the video input module in Fig. 6;

Fig. 9 is a block diagram of the operating principle of a specific embodiment of the video output module in Fig. 6;

Fig. 10 is a block diagram of the operating principle of a specific embodiment of the cache module in Fig. 6.

Detailed description

The invention is further described below with reference to the accompanying drawings and specific embodiments.

A specific embodiment of the invention provides a method for converting planar video into stereoscopic video. The method uses an FPGA as the core processing device, with the hardware design implemented on the FPGA. The method performs the following steps on each frame of the video to be processed (the planar video); see Fig. 1:

Step 10: Start.

Step 21: Obtain a first depth map D₁ by block-matching motion estimation.

Step 22: Extract the vanishing point and vanishing lines in the current frame image by an edge detection algorithm and a Hough transform algorithm, and build a second depth map D₂ from the relationship between image depth and the vanishing point and vanishing lines.

Step 23: Estimate a third depth map D₃ by a method based on color information.

Step 30: Perform depth fusion on the first depth map D₁, the second depth map D₂ and the third depth map D₃ to obtain the depth map D of the current frame image.

Step 40: Based on the DIBR (Depth Image Based Rendering) algorithm, generate multi-view stereoscopic views from one reference image and one depth map D, wherein the reference image is the current frame image.

Step 50: According to the user's stereoscopic video format requirement, select at least some of the views from the multi-view stereoscopic views and perform stereoscopic rendering to generate a color stereoscopic video in the corresponding format.

It should be understood that, among the above steps, steps 21, 22 and 23 can be performed simultaneously.

For step 21, a specific algorithm, FSBMA (full-search block matching algorithm), can be used as follows. Suppose step 21 is performed on a current frame image I₁. The previous frame image I₂ is then also extracted as the reference frame. Block-matching motion estimation is applied to the current frame and the reference frame to compute a first motion vector, and a predicted frame image I_pre is obtained from the current frame image I₁ and the first motion vector. Then, with the predicted frame image I_pre as the reference frame and the previous frame image I₂ as the current frame, a second motion vector is computed by the same method, and the first depth map D₁ of the current frame image I₁ is obtained from the second motion vector; the gray value of each point in D₁ is the magnitude of the motion vector of the corresponding pixel in the previous frame image I₂. Specifically, the motion estimation can use the mean absolute difference (MAD) block-matching criterion, which is well suited to hardware implementation, as given by formula (1).

In formula (1), I₁(x, y) is the pixel gray value of the current frame, I₂(x+u, y+v) is the pixel gray value of the reference frame, N is the selected macroblock size, (x, y) denotes the motion vector (or displacement vector), and (u, v) denotes the pixel coordinates within the macroblock. The first motion vector and the second motion vector are each obtained by the method of formula (1).
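
Formula (1) itself is not reproduced in this text. As a minimal sketch, the code below assumes the standard MAD definition, MAD(x, y) = (1/N²) Σᵤ Σᵥ |I₁(u, v) − I₂(u + x, v + y)|, where (u, v) runs over the N×N macroblock and (x, y) is the candidate displacement. The function names, the search range, the single-pass search (the two-pass scheme with the predicted frame is omitted) and the rescaling of vector magnitudes to 0..255 are all illustrative assumptions, not details taken from the patent.

```python
def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute difference between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for v in range(n):
        for u in range(n):
            total += abs(cur[by + v][bx + u] - ref[by + v + dy][bx + u + dx])
    return total / (n * n)


def full_search(cur, ref, bx, by, n, search):
    """Return the displacement (dx, dy) within +/-search that minimises MAD."""
    h, w = len(ref), len(ref[0])
    span = range(-search, search + 1)
    # Try small displacements first so that ties favour the shortest vector.
    candidates = sorted(((dx, dy) for dy in span for dx in span),
                        key=lambda d: d[0] ** 2 + d[1] ** 2)
    best, best_cost = (0, 0), float("inf")
    for dx, dy in candidates:
        # Skip candidates whose block would fall outside the reference frame.
        if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
            continue
        cost = mad(cur, ref, bx, by, dx, dy, n)
        if cost < best_cost:
            best, best_cost = (dx, dy), cost
    return best


def motion_depth(cur, ref, n=4, search=2):
    """Per-block motion-vector magnitude, rescaled to 0..255 (assumed scaling)."""
    h, w = len(cur), len(cur[0])
    mags = [[0.0] * w for _ in range(h)]
    for by in range(0, h - n + 1, n):
        for bx in range(0, w - n + 1, n):
            dx, dy = full_search(cur, ref, bx, by, n, search)
            m = (dx * dx + dy * dy) ** 0.5
            for v in range(n):
                for u in range(n):
                    mags[by + v][bx + u] = m
    peak = max(max(row) for row in mags) or 1.0
    return [[round(255 * m / peak) for m in row] for row in mags]
```

The exhaustive search is costly in software but maps naturally onto parallel hardware, since every candidate displacement can be evaluated independently.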

For step 22, the edge detection algorithm uses the Sobel operator, namely a horizontal operator that detects horizontal edges and a vertical operator that detects vertical edges. The implementation principle of the edge detection algorithm in the FPGA is shown in Fig. 3: the original image (i.e. the current frame image) is input; horizontal (i.e. transverse) edge detection and vertical (i.e. longitudinal) edge detection are performed with the horizontal operator and the vertical operator respectively; the transverse and longitudinal edge detection images are then combined into a gradient image, thresholded, and output as the edge detection image.
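
The two operator matrices are not reproduced in this text. The sketch below assumes the standard 3×3 Sobel kernels; combining the two responses as |gh| + |gv| and the threshold value of 128 are likewise assumptions chosen for illustration, and the function names are hypothetical.

```python
# Standard 3x3 Sobel kernels (assumed; the patent's matrices are not shown here).
SOBEL_H = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # responds to horizontal edges
SOBEL_V = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # responds to vertical edges


def apply3(img, kernel, x, y):
    """Response of a 3x3 kernel centred on pixel (x, y)."""
    return sum(kernel[j][i] * img[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def sobel_edges(img, threshold=128):
    """Binary edge map: 1 where |gh| + |gv| exceeds the threshold, else 0.
    Border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gh = apply3(img, SOBEL_H, x, y)
            gv = apply3(img, SOBEL_V, x, y)
            out[y][x] = 1 if abs(gh) + abs(gv) > threshold else 0
    return out
```

The sum of absolute values avoids a square root, which is a common simplification when the gradient magnitude only feeds a threshold.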

The Hough transform algorithm represents a straight line by its polar equation, as follows:

ρ = x·cos θ + y·sin θ, 0° ≤ θ < 180° (2)

In formula (2), ρ is the perpendicular distance from the origin to the line, θ is the angle between the perpendicular from the origin to the line and the x-axis direction, and x and y are the coordinates of the pixel relative to the origin. The implementation principle in the FPGA is shown in Fig. 4: the edge detection image obtained in Fig. 3 is used as the input of the Hough transform algorithm, and line parameters are output according to the calculation flow in Fig. 4, giving the vanishing lines. The second depth map D₂ of the current frame image is then obtained from the geometric perspective relation that "the intersection of the vanishing lines is the vanishing point, the vanishing point is the point of maximum depth, and image depth decreases from maximum to minimum along the vanishing lines".
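
The voting scheme behind formula (2) and the vanishing-point construction can be sketched as follows. The function names, the 1-degree angular step and the integer rounding of ρ are assumptions; so is the radial depth ramp in `perspective_depth`, which only illustrates the stated relation (farthest at the vanishing point) using the 0 = farthest, 255 = nearest convention from the background section.

```python
import math


def hough_votes(edges):
    """Accumulate votes for rho = x*cos(theta) + y*sin(theta), theta = 0..179 deg."""
    votes = {}
    for y, row in enumerate(edges):
        for x, on in enumerate(row):
            if on:
                for t in range(180):
                    th = math.radians(t)
                    rho = round(x * math.cos(th) + y * math.sin(th))
                    votes[(rho, t)] = votes.get((rho, t), 0) + 1
    return votes


def intersect(line_a, line_b):
    """Intersection (x, y) of two lines given as (rho, theta_degrees)."""
    (r1, t1), (r2, t2) = line_a, line_b
    a1, a2 = math.radians(t1), math.radians(t2)
    det = math.sin(a2 - a1)
    if abs(det) < 1e-9:
        raise ValueError("lines are parallel")
    x = (r1 * math.sin(a2) - r2 * math.sin(a1)) / det
    y = (r2 * math.cos(a1) - r1 * math.cos(a2)) / det
    return x, y


def perspective_depth(w, h, vp):
    """Assumed mapping: 0 (farthest) at the vanishing point, rising linearly
    with distance to 255 (nearest) at the pixel farthest from it."""
    dist = [[math.hypot(x - vp[0], y - vp[1]) for x in range(w)] for y in range(h)]
    far = max(max(row) for row in dist) or 1.0
    return [[round(255 * d / far) for d in row] for row in dist]
```

In use, the strongest accumulator cells would be taken as vanishing lines, two of them passed to `intersect`, and the result passed to `perspective_depth`; selecting distinct peaks (non-maximum suppression) is left out of this sketch.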

Step 23 can be implemented as follows: for the current frame image, compute for each pixel the difference between the blue component and the red component (the first difference) and the difference between the blue component and the green component (the second difference); then multiply the first difference by the second difference and use the result as the value of the corresponding pixel in the third depth map D₃, thereby forming the complete third depth map D₃.
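
A minimal sketch of this color cue is shown below. The product (B − R)(B − G) follows the text; the linear rescaling of the raw products to 0..255 is an assumption, since the text does not say how the products are mapped to 8-bit gray values, and the function name is hypothetical.

```python
def color_depth(rgb):
    """rgb: rows of (r, g, b) tuples. Returns (B - R) * (B - G) per pixel,
    linearly rescaled to 0..255 (the rescaling is an assumption)."""
    raw = [[(b - r) * (b - g) for (r, g, b) in row] for row in rgb]
    lo = min(min(row) for row in raw)
    hi = max(max(row) for row in raw)
    span = (hi - lo) or 1          # avoid division by zero on flat images
    return [[round(255 * (v - lo) / span) for v in row] for row in raw]
```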

The FPGA uses a parallel hardware computing architecture: multiple processing units form a parallel processing array that computes the first depth map D₁, the second depth map D₂ and the third depth map D₃ simultaneously, improving real-time performance.

For step 30, weighted depth-map fusion can be used to obtain the required final depth map D. The specific method is as follows:

Weighted depth-map fusion D = αD₁ + βD₂ + γD₃ is performed on the first depth map D₁, the second depth map D₂ and the third depth map D₃, where α + β + γ = 1, and different values of α, β and γ are configured for different video scenes. In this way, the stereoscopic video generated from the depth map D by subsequent processing has high image clarity and a good stereoscopic effect. For example, when the current frame image contains mostly artificial scenery, the weights are configured for an artificial scene, i.e. 0.5 < α < 1, 0.2 < β < 0.5 and 0 < γ < 0.1, for example α = 0.6875, β = 0.25, γ = 0.0625; when the current frame image contains mostly natural scenery, the weights are configured for a natural scene, i.e. 0.5 < α < 1, 0 < β < 0.1 and 0.2 < γ < 0.5, for example α = 0.6875, β = 0.0625, γ = 0.25.
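
The fusion rule can be sketched directly. The two weight presets are the example values given above; the function and constant names are hypothetical, and rounding to the nearest integer is an assumption.

```python
ARTIFICIAL_SCENE = (0.6875, 0.25, 0.0625)   # example (alpha, beta, gamma) from the text
NATURAL_SCENE = (0.6875, 0.0625, 0.25)


def fuse(d1, d2, d3, weights):
    """Per-pixel weighted fusion D = alpha*D1 + beta*D2 + gamma*D3."""
    alpha, beta, gamma = weights
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [[round(alpha * a + beta * b + gamma * c)
             for a, b, c in zip(r1, r2, r3)]
            for r1, r2, r3 in zip(d1, d2, d3)]
```

Both example presets are multiples of 1/16 (11/16, 4/16 and 1/16), so in fixed-point hardware the fusion could plausibly be computed with integer multiplies and a 4-bit right shift; that reading is an inference, not something the text states.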

For step 40, the depth-image-based rendering (DIBR) algorithm is used. The current frame image serves as the reference image required by the DIBR algorithm, and the depth map D, after bilateral filtering, serves as the depth map required by the algorithm, so that multi-view stereoscopic views are generated from one reference image and one depth map. The detailed process is as follows. With reference to Fig. 2, the depth map D is first bilaterally filtered as shown in Fig. 5 to obtain a smoother depth map; the bilateral filter is given by formula (3).

In formula (3), BF[I]_p is the depth map after bilateral filtering; W_p is the normalization parameter (which brings the depth value into the range 0 to 255); G_σs and G_σr are Gaussian functions with standard deviations σs and σr respectively; I_p and I_q are the gray values of pixel p and pixel q of the depth map D; and S is the neighborhood of pixel p. As shown in Fig. 5, the depth map D is input, and the Gaussian weight exponent of the similarity between pixels, |I_p − I_q|, and the Gaussian weight exponent of the spatial distance, ‖p − q‖, are computed separately. Bilateral filtering with formula (3) requires a negative-exponential computing module. In the FPGA, the CORDIC (Coordinate Rotation Digital Computer) algorithm computes the hyperbolic cosine and the hyperbolic sine of the variable, and the hyperbolic sine result is subtracted from the hyperbolic cosine result to obtain the negative exponential. To ease FPGA implementation while preserving the quality of the converted stereoscopic video, in this example S is the eight-neighborhood of p, σs is one of 4, 8, 16 and 32, and σr is one of 0.5, 0.25, 0.125 and 0.0625.
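
Formula (3) itself is not reproduced in this text. The sketch below assumes the standard bilateral filter, BF[I]_p = (1/W_p) Σ_{q∈S} G_σs(‖p − q‖) · G_σr(|I_p − I_q|) · I_q, with W_p the sum of the weights, and uses the identity e^(−x) = cosh x − sinh x mentioned above in place of a CORDIC core. Several details are assumptions: gray-level differences are divided by 255 so that the listed σr values are meaningful, the window is the 3×3 block containing p and its eight neighbors, border pixels are copied unchanged, and large exponents are clamped to zero because the floating-point subtraction loses precision there.

```python
import math


def neg_exp(x):
    """e^(-x) for x >= 0 via cosh(x) - sinh(x). Inputs above 10 are clamped
    to 0 (an assumption) because the subtraction loses precision for large x."""
    return 0.0 if x > 10.0 else math.cosh(x) - math.sinh(x)


def gauss(d, sigma):
    """Unnormalised Gaussian weight exp(-d^2 / (2 sigma^2))."""
    return neg_exp(d * d / (2.0 * sigma * sigma))


def bilateral(depth, sigma_s=4.0, sigma_r=0.5):
    """Bilateral filter over the 3x3 window around each interior pixel."""
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]           # border pixels are left unfiltered
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ip = depth[y][x]
            num = den = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    iq = depth[y + dy][x + dx]
                    wgt = (gauss(math.hypot(dx, dy), sigma_s)
                           * gauss(abs(ip - iq) / 255.0, sigma_r))
                    num += wgt * iq
                    den += wgt
            out[y][x] = round(num / den)      # den plays the role of W_p
    return out
```

With a small σr, neighbors across a strong depth step receive almost no weight, so depth discontinuities survive while small fluctuations are averaged out.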

After the smoothed depth map BF[I]_p is obtained by the above bilateral filtering, mapping is performed with the image mapping equations, based on the current frame image (the reference image) and the depth map BF[I]_p, to generate the multi-view stereoscopic views; see Fig. 2. The image mapping equations are formulas (4-1) and (4-2).

In formulas (4-1) and (4-2), x_c is the horizontal coordinate of a pixel in the reference image, x_l is the horizontal coordinate of the pixel in the left-eye view, and x_r is the horizontal coordinate of the pixel in the right-eye view; t_x is the baseline distance between the left-eye and right-eye views, and changing t_x changes the parallax between them; f is the focal length of the virtual cameras of the left-eye and right-eye views, taken as f = 1 in this example; and Z is the depth value at the corresponding pixel. By varying the baseline distance t_x, multiple pairs of left-eye and right-eye views are obtained, forming the multi-view stereoscopic views. Mean filtering can then be used to perform any necessary hole filling and repair on the multi-view stereoscopic views.
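
Formulas (4-1) and (4-2) are not reproduced in this text. The sketch below assumes the commonly used DIBR shift, x_l = x_c + (t_x/2)(f/Z) and x_r = x_c − (t_x/2)(f/Z); the patent's exact form and sign convention may differ. How an 8-bit depth value maps to Z is not stated here, so the code takes a row of strictly positive Z values directly. Function names are hypothetical, occlusion ordering is ignored (a later pixel simply overwrites an earlier one), and the hole filler is a minimal stand-in for the mean filtering mentioned above.

```python
def render_pair(row, z_row, tx, f=1.0):
    """Warp one image row into left-eye and right-eye rows.
    Target pixels that receive no source pixel stay None (holes)."""
    w = len(row)
    left, right = [None] * w, [None] * w
    for xc, (pix, z) in enumerate(zip(row, z_row)):
        shift = (tx / 2.0) * f / z            # assumed form of (4-1)/(4-2); z > 0
        xl, xr = round(xc + shift), round(xc - shift)
        if 0 <= xl < w:
            left[xl] = pix
        if 0 <= xr < w:
            right[xr] = pix
    return left, right


def fill_holes(row):
    """Fill each hole with the mean of its valid immediate neighbours."""
    out = row[:]
    for x, p in enumerate(row):
        if p is None:
            near = [row[i] for i in (x - 1, x + 1)
                    if 0 <= i < len(row) and row[i] is not None]
            out[x] = round(sum(near) / len(near)) if near else 0
    return out
```

Calling `render_pair` repeatedly with different `tx` values, for example 2, 4 and 6, yields several left/right pairs from the same reference row, which is how varying the baseline produces multiple viewpoints.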

Step 50 can, for example, proceed as follows: if the user needs stereoscopic video in red-blue format or side-by-side format, one pair of left-eye and right-eye views is selected from the multi-view stereoscopic views and stereoscopically rendered; if multi-view stereoscopic video or row-interleaved stereoscopic video is needed, multiple pairs of left-eye and right-eye views are selected and stereoscopically rendered. Color stereoscopic video in a variety of formats can thus be generated.
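
Format selection for the single-pair case can be sketched as a small dispatch table. The channel assignment for the red-blue format (red from the left view, green and blue from the right view) and the column subsampling for side-by-side are assumptions, since the text does not specify them; all names are hypothetical.

```python
def anaglyph(left, right):
    """Red channel from the left view, green and blue from the right view."""
    return [[(l[0], r[1], r[2]) for l, r in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]


def side_by_side(left, right):
    """Half-width left view next to half-width right view
    (every other column is kept so the frame width is unchanged)."""
    return [lrow[::2] + rrow[::2] for lrow, rrow in zip(left, right)]


RENDERERS = {"red-blue": anaglyph, "side-by-side": side_by_side}


def render(fmt, views, pair_index=0):
    """Pick one (left, right) pair from the multi-view set and render it."""
    left, right = views[pair_index]
    return RENDERERS[fmt](left, right)
```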

A specific embodiment of the invention also provides a device for converting planar video into stereoscopic video, as shown in Fig. 6. The functional modules of the device are built mainly on an FPGA. The device comprises a control module, a cache module, a video conversion module and a stereoscopic rendering module. The cache module stores the RGB video to be processed and the intermediate results of processing. The video conversion module is connected to the cache module and to the stereoscopic rendering module; it converts the planar images of the RGB video to be processed into multi-view stereoscopic views and inputs the multi-view stereoscopic views to the stereoscopic rendering module, the multi-view stereoscopic views comprising multiple pairs of left-eye and right-eye views. The stereoscopic rendering module selects the corresponding left-eye and right-eye views from the multi-view stereoscopic views according to the user's stereoscopic video output format requirement, performs stereoscopic rendering on the selected views, and generates a color stereoscopic video in the corresponding format. The control module is connected to the video conversion module and to the stereoscopic rendering module, and configures the device according to user requirements, which include the stereoscopic video output format requirement.

With reference to Fig. 7, in some specific embodiments the video conversion module may comprise a first depth estimation module, a second depth estimation module, a third depth estimation module, a depth fusion module and a multi-view stereoscopic view generation module. The first, second and third depth estimation modules are each connected to the cache module and to the depth fusion module.

The first depth estimation module performs block-matching motion estimation on a current frame image of the RGB video to be processed to obtain the first depth map D₁; for the specific implementation, see step 21 of the foregoing method.

The second depth estimation module performs geometric perspective estimation on the current frame image to obtain the second depth map D₂; for the specific implementation, see the foregoing step 22 and Figs. 3 and 4.

The third depth estimation module performs color-information-based estimation on the current frame image to obtain the third depth map D₃; for the specific implementation, see the foregoing step 23.

The depth fusion module performs weighted depth-map fusion on the first depth map D₁, the second depth map D₂ and the third depth map D₃ to obtain the depth map D of the current frame image; for the specific implementation, see the foregoing step 30.

The multi-view stereoscopic view generation module generates the multi-view stereoscopic views based on the depth map D and the current frame image. As shown in Fig. 7, the multi-view stereoscopic views are generated after processing by the video conversion module; the stereoscopic rendering module then selects the corresponding left-eye and right-eye views from the multi-view stereoscopic views and renders them stereoscopically, so that stereoscopic video in different formats can be generated. For details, see the foregoing method for converting planar video into stereoscopic video; they are not repeated here.

In some specific embodiments, with reference to Fig. 6, the device may further comprise a video input module and a video output module. The video input module is connected to the cache module and inputs the RGB video to be processed into the cache module. The video output module is connected to the stereoscopic rendering module and outputs the color stereoscopic video. The control module comprises a data configuration module and a human-machine communication module, wherein the data configuration module is connected to the video conversion module, the stereoscopic rendering module, the video input module, the video output module and the human-machine communication module respectively. More preferably, the human-machine communication module comprises a human-machine interface and a host-computer communication module, wherein the human-machine interface may include a key panel and a remote control module; the user can configure the device through the key panel, the remote control module or the host-computer communication module.

In a more preferred embodiment, with reference to Fig. 8, the video input module specifically comprises a video signal input panel 100, a video input converter group 300 and an input signal selector 500, wherein the input signal selector 500 is connected to the data configuration module. With reference to Fig. 9, the video output module specifically comprises an output signal selector 200, a video output converter group 400 and a video signal output panel 600, wherein the output signal selector 200 is connected to the data configuration module. As shown in Figs. 8 and 9, the video signal input panel 100 and the video signal output panel 600 each include several kinds of video interface, such as an AV interface, a BNC interface and a USB interface; the video input converter group 300 and the video output converter group 400 each include multiple converters corresponding to the respective video interfaces, such as an AV converter, a BNC converter and a VGA converter. As shown in Figs. 6, 8 and 9, the human-machine communication module is used to input the user requirements, and the data configuration module configures the input signal selector 500, the output signal selector 200, the video conversion module and the stereoscopic rendering module according to those requirements. For example, to convert planar video from the BNC interface into stereoscopic video and output it for playback through the AV interface, the user inputs configuration data via the human-machine communication module so that the data configuration module configures the input signal selector 500 to select the BNC converter from the video input converter group 300. The BNC converter converts the planar video input through the BNC interface into generic 24-bit RGB (planar) video that the video conversion module can process. This video is then input to the cache module, or to the cache module and the video conversion module simultaneously, and the foregoing method for converting planar video into stereoscopic video is performed. According to the stereoscopic display device in use, the user configures the stereoscopic rendering module to render color stereoscopic video in a format matching that display device (such as red-blue format, multi-view format, side-by-side format or row-interleaved format). The user then configures the output signal selector 200 to select the AV converter from the video output converter group 400, which converts the color stereoscopic video output by the stereoscopic rendering module into stereoscopic video matching the AV interface, ready for output and playback.

By inputting data to the data configuration module through the human-machine communication module, the user can configure the device in various ways. In addition to the configuration described above, the user can also select the video scene (artificial scene or natural scene) to adjust the effect of the stereoscopic conversion. The data configuration module is responsible for applying the configuration data input by the user to the corresponding modules.

The remote control module comprises, for example, a wireless remote control and a remote-control signal receiving module. Using the wireless remote control, the user can remotely configure the video signal input interface and the video signal output interface, select the output stereoscopic video format, adjust the stereoscopic conversion effect, select the video scene conversion mode, and so on. The key panel comprises multiple function keys, with which the user can configure the video signal input interface and the video signal output interface, configure the output stereoscopic video format, adjust the stereoscopic conversion effect, and so on.

The host-computer communication module can communicate with a PC host computer by some communication means. Through this module, the current status information of the device can be transmitted to the host computer, and the device can also be configured from the host computer.

The cache module comprises a storage control module inside the FPGA and a large-capacity off-chip SDRAM; see Fig. 10. It caches the 24-bit RGB digital video signal input by a front-end module (for example the video input module described above), as well as intermediate results, final results and other data of the conversion algorithm. The storage control module forms an asynchronous large-capacity FIFO cache from the FPGA's on-chip RAM, a FIFO control module and an SDRAM control module. In addition, random reads and writes to the SDRAM are performed through the SDRAM controller in the storage control module.

The above is a further detailed description of the present invention in combination with specific preferred embodiments, and the specific implementation of the present invention should not be considered limited to these descriptions. Those skilled in the art may make equivalent substitutions or obvious modifications with the same performance or use without departing from the concept of the present invention, and all of these shall be regarded as falling within the protection scope of the present invention.

Claims

1. A method for converting planar video into stereoscopic video, comprising performing the following steps on each frame image of the planar video:

S1. Obtaining a depth map D of the current frame image: obtaining a first depth map D₁ by motion estimation based on block matching; extracting the vanishing point and vanishing lines in the current frame image by an edge detection algorithm and a Hough transform algorithm, and building a second depth map D₂ according to the relation between image depth and the vanishing point and vanishing lines; estimating a third depth map D₃ by a method based on color information; performing depth fusion on the first depth map D₁, the second depth map D₂ and the third depth map D₃ to obtain the depth map D of the current frame image; and performing bilateral filtering on the depth map D to obtain a smoother depth map BF[I]_p;

S2. Based on the DIBR algorithm, generating multi-viewpoint stereoscopic views from a reference image and the depth map BF[I]_p, wherein the reference image is the current frame image and the multi-viewpoint stereoscopic views include multiple pairs of left-eye and right-eye views;

S3. According to the stereoscopic video format required by the user, selecting at least one pair of the left-eye and right-eye views from the multi-viewpoint stereoscopic views and performing stereoscopic rendering on it, to generate a color stereoscopic video of the corresponding format;

wherein the formula of the bilateral filtering is as follows:

In the above formula, BF[I]_p is the depth map after bilateral filtering; W_p is the normalization parameter that keeps depth values in the range 0 to 255; G_σs and G_σr are Gaussian functions with standard deviations σs and σr respectively; I_p and I_q are the gray values of pixel p and pixel q of the depth map D; |I_p − I_q| is the gray-value difference that determines the Gaussian weight for the similarity between pixels p and q; ||p − q|| is the spatial distance that determines the Gaussian weight for the distance between pixels p and q; and S is the neighborhood of pixel p, taken as the eight-neighborhood of p. σs takes one of the values 4, 8, 16, 32, and σr takes one of the values 0.5, 0.25, 0.125, 0.0625;
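The formula itself does not appear in this text. Read against the symbol definitions above, it matches the standard bilateral filter, BF[I]_p = (1/W_p) Σ_{q∈S} G_σs(||p − q||) G_σr(|I_p − I_q|) I_q, with G_σ(x) = exp(−x²/(2σ²)). The Python sketch below assumes that form. It further assumes that depth values are normalized to [0, 1] (suggested by σr ≤ 0.5, but not stated), that S includes p itself, and that W_p is the sum of the weights; the function names are hypothetical.

```python
import math


def gaussian(x, sigma):
    """Unnormalized Gaussian weight G_sigma(x) = exp(-x^2 / (2 sigma^2))."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))


def bilateral_filter(depth, sigma_s=4.0, sigma_r=0.5):
    """Filter a depth map given as a list of rows with values in [0, 1].

    Assumption: S is the 3x3 window around p (eight neighbours plus p),
    clipped at the image border, and W_p is the sum of the weights.
    """
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            i_p = depth[y][x]
            acc = 0.0  # weighted sum of neighbour values
            w_p = 0.0  # normalization term W_p
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    qy, qx = y + dy, x + dx
                    if 0 <= qy < h and 0 <= qx < w:
                        i_q = depth[qy][qx]
                        weight = (gaussian(math.hypot(dx, dy), sigma_s)
                                  * gaussian(i_p - i_q, sigma_r))
                        acc += weight * i_q
                        w_p += weight
            out[y][x] = acc / w_p
    return out
```

With a small σr the range weight suppresses neighbours across a depth edge, so flat regions are smoothed while the edge stays sharp.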

and a negative exponent computing module is used when performing the bilateral filtering with the above formula: in the FPGA, the coordinate rotation digital computer (CORDIC) algorithm calculates the hyperbolic cosine and the hyperbolic sine of the variable value, and the hyperbolic sine result is subtracted from the hyperbolic cosine result to obtain the negative exponential function result.
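The negative exponent step relies on the identity exp(−v) = cosh(v) − sinh(v). The sketch below illustrates it with a floating-point hyperbolic CORDIC in rotation mode; an FPGA would use fixed-point shifts and adds and a precomputed arctanh table. The function names and iteration count are hypothetical, and the basic algorithm only converges for |v| up to about 1.118, so larger arguments would need a range-reduction step that the claim does not describe.

```python
import math


def cordic_shifts(n):
    """Shift sequence 1, 2, 3, 4, 4, 5, ..., 13, 13, 14, ... of length n.

    Hyperbolic CORDIC repeats iterations 4, 13, 40, ... to guarantee convergence.
    """
    seq = []
    i, repeat = 1, 4
    while len(seq) < n:
        seq.append(i)
        if i == repeat:
            seq.append(i)
            repeat = 3 * repeat + 1
        i += 1
    return seq[:n]


def cordic_cosh_sinh(theta, n=24):
    """Return (cosh(theta), sinh(theta)) for |theta| below about 1.118."""
    shifts = cordic_shifts(n)
    gain = 1.0
    for i in shifts:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * i))
    # Pre-scale x so the CORDIC gain is cancelled at the end.
    x, y, z = 1.0 / gain, 0.0, theta
    for i in shifts:
        d = 1.0 if z >= 0 else -1.0
        t = 2.0 ** -i
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)
    return x, y


def neg_exp(v, n=24):
    """Approximate exp(-v) as cosh(v) - sinh(v)."""
    c, s = cordic_cosh_sinh(v, n)
    return c - s
```

Each iteration uses only a shift, an add or subtract and one table lookup, which is why the approach suits FPGA logic better than a direct exponential.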

2. The method according to claim 1, characterized in that obtaining the first depth map D₁ in step S1 specifically comprises: extracting two consecutive frame images I₁ and I₂ from the cache, with I₁ as the current frame and I₂ as the reference frame, where I₂ is the frame preceding I₁; calculating a first motion vector by motion estimation based on block matching, and obtaining a prediction frame I_pre from the current frame I₁ and the first motion vector; then, with the prediction frame I_pre as the reference frame and I₂ as the current frame, calculating a second motion vector by motion estimation based on block matching, and obtaining the first depth map D₁ from the second motion vector, wherein the gray value of each point in the first depth map D₁ is the modulus of the motion vector of the corresponding pixel in I₂.
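As an illustration of the block-matching step, the Python sketch below runs a single full-search pass with a sum-of-absolute-differences (SAD) cost and writes the motion-vector modulus into every pixel of each block. The block size, search range, SAD cost and function names are assumptions; the claim does not fix them, and the two-pass scheme with the prediction frame I_pre is not reproduced here.

```python
import math


def sad(cur, ref, by, bx, dy, dx, block):
    """Sum of absolute differences between a block of cur and a shifted block of ref."""
    total = 0
    for y in range(block):
        for x in range(block):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def block_match(cur, ref, block=4, search=2):
    """Return {(by, bx): (dy, dx)}: the best-matching offset in ref for each block of cur."""
    h, w = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            # Start from zero motion so ties favour a static block.
            best = (sad(cur, ref, by, bx, 0, 0, block), 0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    inside = (0 <= by + dy and by + dy + block <= h
                              and 0 <= bx + dx and bx + dx + block <= w)
                    if not inside:
                        continue
                    cost = sad(cur, ref, by, bx, dy, dx, block)
                    if cost < best[0]:
                        best = (cost, dy, dx)
            vectors[(by, bx)] = (best[1], best[2])
    return vectors


def motion_depth(cur, ref, block=4, search=2):
    """Depth map whose gray value is the modulus of each block's motion vector."""
    h, w = len(cur), len(cur[0])
    depth = [[0.0] * w for _ in range(h)]
    for (by, bx), (dy, dx) in block_match(cur, ref, block, search).items():
        magnitude = math.hypot(dx, dy)
        for y in range(by, by + block):
            for x in range(bx, bx + block):
                depth[y][x] = magnitude
    return depth
```

Larger motion maps to a larger gray value, on the premise that objects closer to the camera appear to move faster between frames.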

3. The method according to claim 2, characterized in that the motion estimation uses a hardware architecture designed with a parallel algorithm in an FPGA, and multiple processing units in the FPGA form a parallel processing array to calculate the first depth map D₁, the second depth map D₂ and the third depth map D₃ simultaneously.

4. The method according to claim 1, characterized in that the relation in step S1 between image depth and the vanishing point and vanishing lines is: the intersection of the vanishing lines is the vanishing point, the vanishing point is the point of maximum depth, and the image depth changes from maximum to minimum along the vanishing lines.
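One simple way to picture this relation is a depth map that peaks at the vanishing point and falls off with distance from it. The Python sketch below assumes a linear radial falloff to the farthest image corner, which is an illustrative simplification rather than the gradient along the detected vanishing lines; the function name and the 0 to 255 scale are also assumptions.

```python
import math


def perspective_depth(height, width, vp_y, vp_x, max_depth=255):
    """Depth map with max_depth at the vanishing point (vp_y, vp_x).

    Assumption: depth decreases linearly with Euclidean distance and reaches
    zero at the image corner farthest from the vanishing point.
    """
    farthest = max(math.hypot(vp_y - cy, vp_x - cx)
                   for cy in (0, height - 1)
                   for cx in (0, width - 1))
    if farthest == 0:  # single-pixel image
        return [[max_depth] * width for _ in range(height)]
    return [[round(max_depth * (1.0 - math.hypot(y - vp_y, x - vp_x) / farthest))
             for x in range(width)]
            for y in range(height)]
```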

5. The method according to claim 1, characterized in that obtaining the third depth map D₃ in step S1 specifically comprises: calculating, for each pixel in the current frame image, a first difference between the blue component and the red component and a second difference between the blue component and the green component, and using the product of the first difference and the second difference as the pixel value of the corresponding pixel of the third depth map, thereby obtaining the third depth map D₃.
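The per-pixel computation can be sketched in a few lines of Python. The function name and the (R, G, B) tuple layout are assumptions; the claim does not say how the product is scaled or clipped to a displayable range, so the sketch returns the raw product.

```python
def color_depth(rgb):
    """Third depth map from color: (B - R) * (B - G) for each pixel.

    rgb is a list of rows of (R, G, B) tuples. The raw product is returned;
    any rescaling to 0..255 is left open because the claim does not specify it.
    """
    return [[(b - r) * (b - g) for (r, g, b) in row] for row in rgb]
```

For example, a pixel (R, G, B) = (10, 20, 50) gives (50 − 10) × (50 − 20) = 1200.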

6. The method according to claim 1, characterized in that performing the depth fusion in step S1 to obtain the depth map D specifically comprises: performing depth map weighted fusion on the first depth map D₁, the second depth map D₂ and the third depth map D₃ as D = αD₁ + βD₂ + γD₃, where α + β + γ = 1.

7. The method according to claim 6, characterized in that the video scenes include artificial scenes and natural scenes; when the video scene is an artificial scene, 0.5 < α < 1, 0.2 < β < 0.5 and 0 < γ < 0.1; when the video scene is a natural scene, 0.5 < α < 1, 0 < β < 0.1 and 0.2 < γ < 0.5.
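The weighted fusion D = αD₁ + βD₂ + γD₃ with scene-dependent weights can be sketched as follows. The specific weight triples are hypothetical values chosen only to lie inside the claimed ranges and sum to 1; the names `SCENE_WEIGHTS` and `fuse_depth` are likewise assumptions.

```python
# Hypothetical weights (alpha, beta, gamma) picked inside the claimed ranges.
SCENE_WEIGHTS = {
    "artificial": (0.65, 0.30, 0.05),  # geometry-based D2 weighted above color-based D3
    "natural": (0.65, 0.05, 0.30),     # color-based D3 weighted above geometry-based D2
}


def fuse_depth(d1, d2, d3, scene):
    """Return D = alpha*D1 + beta*D2 + gamma*D3 for equally sized depth maps."""
    alpha, beta, gamma = SCENE_WEIGHTS[scene]
    return [[alpha * a + beta * b + gamma * c
             for a, b, c in zip(row1, row2, row3)]
            for row1, row2, row3 in zip(d1, d2, d3)]
```

In both scene types the motion-based map D₁ carries more than half of the weight, while the scene selection decides whether the perspective map or the color map contributes more of the remainder.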