CN104506872B - Method and device for converting planar video into stereoscopic video - Google Patents

Method and device for converting planar video into stereoscopic video

Info

Publication number
CN104506872B
Authority
CN
China
Prior art keywords
depth map
video
depth
dimensional
pixel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410697508.1A
Other languages
Chinese (zh)
Other versions
CN104506872A (en
Inventor
张新
柯家琪
廖智宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Youshu Technology Co., Ltd.
Original Assignee
SHENZHEN KAIAOSI TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN KAIAOSI TECHNOLOGY Co Ltd
Priority to CN201410697508.1A
Publication of CN104506872A
Application granted
Publication of CN104506872B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/15 Processing image signals for colour aspects of image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a method and device for converting planar (2D) video into stereoscopic video. The method performs the following steps on each frame of the planar video. S1: obtain a depth map D of the current frame: a first depth map D1 is obtained by block-matching motion estimation; a second depth map D2 is built from geometric perspective relations using an edge detection algorithm and the Hough transform; a third depth map D3 is estimated from color information; depth fusion is performed on D1, D2 and D3 to obtain the depth map D. S2: based on the DIBR algorithm, generate multi-view stereoscopic views from a reference image and the depth map D. S3: according to the stereoscopic video format required by the user, select at least some of the left-eye and right-eye views from the multi-view stereoscopic views and perform stereoscopic rendering to generate a color stereoscopic video in the corresponding format. Stereoscopic video generated by this method has a good stereoscopic effect and can be produced in different formats according to user requirements.

Description

Method and device for converting planar video into stereoscopic video
Technical field
The present invention relates to a method and device for converting planar video into stereoscopic video.
Background
Planar-to-stereoscopic video conversion, also known as 2D-to-3D conversion, applies suitable technical means to an existing planar video to extract the depth information it contains and, from that depth information, to simulate the virtual scenes observed from multiple viewpoints, thereby producing a three-dimensional perception. In stereoscopic video technology, the three-dimensional effect is achieved through binocular stereo vision. Binocular stereo vision relies on the principle of binocular imaging: by simulating how two eyes form images, left and right images or video streams are delivered separately to the viewer's left and right eyes by special means, and the viewer's brain reconstructs the stereo scene in the image or video, producing the perception of depth.
As a new way of describing the three-dimensional world, a stereoscopic image contains not only the surface information of the scene that a conventional planar image carries, but also three-dimensional information related to the position of objects in the scene, namely depth information. Compared with conventional planar video, stereoscopic video reflects real-world scenes more faithfully.
A depth map of a scene can be obtained with stereo vision algorithms; it reflects the front-to-back (near-to-far) relations of the scene shown in the planar image. In practice, a depth map is usually represented as an 8-bit grayscale image. A value of 0 at a point in the depth map means that the corresponding point of the planar image lies at the farthest position in the relative depth range; a value of 255 means that it lies at the nearest position; other values between 0 and 255 represent intermediate depths within the relative depth range.
3D video technology has made considerable progress, and a range of stereoscopic video capture devices, from high-end to low-end, has appeared on the market. After years of technical accumulation, stereoscopic displays have become affordable, and 3D televisions are entering more and more ordinary households. Behind the recent growth of the 3D industry, however, lie several problems: high-end stereoscopic capture equipment is expensive, high-quality stereoscopic content is scarce, and manual 3D video production is costly. These problems are becoming a bottleneck for the development of 3D video.
In addition, the stereoscopic displays on the market are based on several display principles, such as the grating type, the shutter type and the polarization type. Shutter and polarization displays usually require special stereoscopic glasses, whereas grating-type displays present a stereo scene without special glasses. The stereoscopic video formats supported by grating-type displays, however, are divided into binocular and multi-view formats, while existing stereoscopic video processing devices can usually output only one format, and the stereoscopic effect of their output is poor. This greatly limits the range of use of such devices.
Summary of the invention
The main object of the present invention is to provide a method and device for converting planar video into stereoscopic video, so as to solve the technical problem that existing stereoscopic video processing devices output a single stereoscopic video format with an unsatisfactory stereoscopic effect.
The method for converting planar video into stereoscopic video proposed by the present invention is as follows:
A method for converting planar video into stereoscopic video, comprising performing the following steps on each frame of the planar video:
S1: obtain the depth map D of the current frame: obtain a first depth map D1 by block-matching motion estimation; extract the vanishing point and vanishing lines in the current frame using an edge detection algorithm and the Hough transform, and build a second depth map D2 from the relation between image depth and the vanishing point and vanishing lines; estimate a third depth map D3 by a method based on color information; perform depth fusion on the first depth map D1, the second depth map D2 and the third depth map D3 to obtain the depth map D of the current frame;
S2: based on the DIBR algorithm, generate multi-view stereoscopic views from a reference image and the depth map D, where the reference image is the current frame and the multi-view stereoscopic views comprise multiple pairs of left-eye and right-eye views;
S3: according to the user's stereoscopic video output format requirement, select at least one pair of left-eye and right-eye views from the multi-view stereoscopic views and perform stereoscopic rendering to generate a color stereoscopic video in the corresponding format.
In the above method, different depth maps are obtained for each frame by different methods, these depth maps are fused by weighting into one final depth map per frame, multi-view stereoscopic views are generated from the final depth map and the frame using the DIBR algorithm, and stereoscopic video is produced after stereoscopic rendering. Because several depth maps are computed for a single frame by different methods and then fused into the final depth map on which all subsequent processing is based, the resulting stereoscopic video has a good stereoscopic effect. Moreover, because multi-view stereoscopic views are obtained, different view pairs (a view pair comprises a left-eye image and a right-eye image) can be selected to generate stereoscopic video in different formats. For example, one pair of stereoscopic views at a given viewpoint can be selected to generate red-blue, binocular or side-by-side stereoscopic video; multiple pairs can be selected to generate multi-view or row-interleaved stereoscopic video. The user can thus have the video rendered in the format that matches the display principle of their stereoscopic display, so that it can be shown on displays based on different principles.
The device for converting planar video into stereoscopic video proposed by the present invention is as follows:
A device for converting planar video into stereoscopic video, comprising a control module, a cache module, a video conversion module and a stereoscopic rendering module. The cache module stores the RGB video to be processed and the intermediate results of processing. The video conversion module is connected to the cache module and to the stereoscopic rendering module; it converts the planar images of the RGB video to be processed into multi-view stereoscopic views, which comprise multiple pairs of left-eye and right-eye views, and feeds them to the stereoscopic rendering module. The stereoscopic rendering module selects at least one pair of left-eye and right-eye views from the multi-view stereoscopic views according to the user's stereoscopic video output format requirement, performs stereoscopic rendering on the selected views, and generates a color stereoscopic video in the corresponding format. The control module is connected to the video conversion module and to the stereoscopic rendering module and configures the device according to the user requirements, which include the stereoscopic video output format requirement.
Compared with the prior art, the device provided by the present invention has the following advantage: according to the stereoscopic video format the user requires at the output, different selections can be made among the multiple pairs of left-eye and right-eye views of the multi-view stereoscopic views, and the selected views are rendered into stereoscopic video of the corresponding format. The device can therefore serve stereoscopic displays based on different display principles and has a wide range of application.
Brief description of the drawings
Fig. 1 is a flowchart of a method for converting planar video into stereoscopic video according to an embodiment of the invention;
Fig. 2 is a detailed flowchart of step 40 in Fig. 1;
Fig. 3 is a schematic diagram of the edge detection algorithm as implemented in the FPGA;
Fig. 4 is a schematic diagram of the Hough transform algorithm as implemented in the FPGA;
Fig. 5 is a schematic diagram of bilateral filtering of the depth map D as implemented in the FPGA;
Fig. 6 is a block diagram of a device for converting planar video into stereoscopic video according to an embodiment of the invention;
Fig. 7 is a functional block diagram of one embodiment of the video conversion module in Fig. 6;
Fig. 8 is a functional block diagram of one embodiment of the video input module in Fig. 6;
Fig. 9 is a functional block diagram of one embodiment of the video output module in Fig. 6;
Fig. 10 is a functional block diagram of one embodiment of the cache module in Fig. 6.
Detailed description
The invention is further described below with reference to the accompanying drawings and specific embodiments.
An embodiment of the present invention provides a method for converting planar video into stereoscopic video. The method uses an FPGA as the core processing device, and the hardware design is implemented on the FPGA. The method performs the following steps on each frame of the video to be processed (the planar video); see Fig. 1:
Step 10: Start.
Step 21: Obtain a first depth map D1 by block-matching motion estimation.
Step 22: Extract the vanishing point and vanishing lines in the current frame using an edge detection algorithm and the Hough transform, and build a second depth map D2 from the relation between image depth and the vanishing point and vanishing lines.
Step 23: Estimate a third depth map D3 by a method based on color information.
Step 30: Perform depth fusion on the first depth map D1, the second depth map D2 and the third depth map D3 to obtain the depth map D of the current frame.
Step 40: Based on the DIBR (Depth Image Based Rendering) algorithm, generate multi-view stereoscopic views from one reference image and one depth map D, where the reference image is the current frame.
Step 50: According to the stereoscopic video format required by the user, select at least some of the views from the multi-view stereoscopic views and perform stereoscopic rendering to generate a color stereoscopic video in the corresponding format.
It should be understood that steps 21, 22 and 23 above can be performed simultaneously.
For step 21, a specific algorithm that can be used is FSBMA (full-search block matching algorithm). Suppose step 21 is performed on a current frame I1. The previous frame I2 of the current frame is also extracted as the reference frame. Block-matching motion estimation is applied to the current frame and the reference frame to compute a first motion vector, and a predicted frame Ipre is obtained from the current frame I1 and the first motion vector. Then, with the predicted frame Ipre as the reference frame and the previous frame I2 as the current frame, a second motion vector is computed in the same way, and the first depth map D1 of the current frame I1 is obtained from the second motion vector: the gray value of each point in D1 is the magnitude of the motion vector of the corresponding pixel in the previous frame I2. For the motion estimation, the mean absolute difference (MAD) block matching criterion, which is well suited to hardware implementation, can be used, namely:
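The image of formula (1) is not reproduced in this text. As a hedged reconstruction (an assumption, not the patent's own typesetting), the usual MAD criterion for an N×N macroblock, written so that (x, y) is the candidate displacement and (u, v) runs over the pixels of the macroblock, reads:

```latex
\mathrm{MAD}(x, y) = \frac{1}{N^{2}} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1}
  \left| I_1(u, v) - I_2(x + u,\; y + v) \right| \tag{1}
```

Under this reading, the motion vector of a macroblock is the displacement (x, y) that minimizes MAD within the search window.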
In formula (1), I1(x, y) is the pixel gray value of the current frame, I2(x+u, y+v) is the pixel gray value of the reference frame, N is the size of the selected macroblock, (x, y) denotes the motion vector (or displacement vector), and (u, v) denotes the pixel coordinates within the macroblock. The first and second motion vectors described above are each obtained using formula (1).
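The full-search idea can be sketched in software as follows. This is a minimal illustration, not the FPGA design: it runs a single matching pass between the current and previous frame (the two-pass scheme with the predicted frame Ipre is omitted), and the function names, the block size `n` and the search range `search` are assumptions chosen for the example.

```python
import numpy as np

def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    diff = block_a.astype(np.int32) - block_b.astype(np.int32)
    return float(np.mean(np.abs(diff)))

def full_search(cur, ref, top, left, n=8, search=4):
    """Exhaustively test every displacement in [-search, search]^2.

    Returns ((dx, dy), cost) for the n x n block of `cur` at (top, left).
    """
    block = cur[top:top + n, left:left + n]
    h, w = ref.shape
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > h or x + n > w:
                continue  # candidate block would fall outside the frame
            cost = mad(block, ref[y:y + n, x:x + n])
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost

def motion_depth(cur, ref, n=8, search=4):
    """Depth cue per block: magnitude of the block's motion vector."""
    h, w = cur.shape
    depth = np.zeros((h, w), dtype=np.float64)
    for top in range(0, h - n + 1, n):
        for left in range(0, w - n + 1, n):
            (dx, dy), _ = full_search(cur, ref, top, left, n, search)
            depth[top:top + n, left:left + n] = np.hypot(dx, dy)
    return depth
```

Every pixel of a block receives the same value here; how the motion magnitudes are scaled to 8-bit gray levels is not stated in the text and is left out.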
For step 22, the edge detection algorithm uses the Sobel operator, namely a horizontal operator for detecting horizontal edges and a vertical operator for detecting vertical edges. Fig. 3 shows how the edge detection algorithm is implemented in the FPGA: the original image (i.e. the current frame) is input; horizontal (lateral) and vertical (longitudinal) edge detection are performed with the horizontal and vertical operators respectively; the horizontal-edge and vertical-edge images are then combined into a gradient image, thresholded, and output as the edge-detected image.
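The operator matrices are not reproduced in this text. The sketch below uses the standard 3×3 Sobel kernels as an assumption, combines the two responses as |Gh| + |Gv| (one common hardware-friendly choice, also an assumption) and applies a hypothetical threshold of 128.

```python
import numpy as np

# Standard Sobel kernels (assumed; the patent's own matrices are not shown).
K_H = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])  # responds to horizontal edges
K_V = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])  # responds to vertical edges

def correlate3x3(img, kernel):
    """Correlate a 2-D image with a 3x3 kernel, replicating the border."""
    img = img.astype(np.int32)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros((h, w), dtype=np.int32)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_edges(gray, threshold=128):
    """Return a boolean edge map: gradient combination followed by thresholding."""
    g_h = correlate3x3(gray, K_H)
    g_v = correlate3x3(gray, K_V)
    magnitude = np.abs(g_h) + np.abs(g_v)
    return magnitude > threshold
```

The boolean edge map is the kind of input a Hough transform stage expects.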
The Hough transform algorithm represents a straight line by its polar equation:
ρ = x cos θ + y sin θ, 0° ≤ θ < 180° (2)
In formula (2), ρ is the perpendicular distance from the origin to the line, θ is the angle between that perpendicular and the x-axis, x is the column coordinate of the pixel relative to the origin, and y is the row coordinate of the pixel relative to the origin. The implementation in the FPGA is shown in Fig. 4: the edge-detected image obtained in Fig. 3 is the input of the Hough transform algorithm, and the computation flow of Fig. 4 outputs the line parameters, which give the vanishing lines. The second depth map D2 of the current frame is then obtained from the geometric perspective relation: "the intersection of the vanishing lines is the vanishing point; the vanishing point is the point of greatest depth; and image depth changes from maximum to minimum along the vanishing lines".
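The voting scheme behind formula (2) can be sketched as below. It covers only the line-detection part; building D2 from the vanishing point is not shown. The function name, the 1-degree angular step and the rounding of ρ to whole pixels are assumptions made for this example.

```python
import numpy as np

def hough_accumulator(edges, n_theta=180):
    """Vote in (rho, theta) space for every edge pixel.

    Returns (acc, diag): acc[rho + diag, theta_deg] counts the edge pixels
    lying on the line rho = x*cos(theta) + y*sin(theta).
    """
    h, w = edges.shape
    diag = int(np.ceil(np.hypot(h, w)))          # bound on |rho|
    acc = np.zeros((2 * diag + 1, n_theta), dtype=np.int32)
    ys, xs = np.nonzero(edges)                   # row (y) and column (x) coordinates
    for t_idx in range(n_theta):
        theta = np.deg2rad(t_idx)
        rho = np.rint(xs * np.cos(theta) + ys * np.sin(theta)).astype(int)
        np.add.at(acc, (rho + diag, t_idx), 1)   # unbuffered add handles repeated cells
    return acc, diag
```

Peaks in `acc` correspond to candidate lines; intersecting the strongest ones would give an estimate of the vanishing point.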
Step 23 can be implemented as follows: for the current frame, compute for each pixel the difference between the blue and red components (the first difference) and the difference between the blue and green components (the second difference), then multiply the first difference by the second difference. The result is used as the value of the corresponding pixel in the third depth map D3, which yields the complete third depth map D3.
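A minimal sketch of this color cue follows. It assumes the frame is an H×W×3 array in RGB channel order, and it returns the raw product: how that product is mapped to 8-bit gray levels is not specified in the text, so no scaling is applied.

```python
import numpy as np

def color_depth(rgb):
    """Per-pixel product (B - R) * (B - G), computed in signed integers."""
    c = rgb.astype(np.int32)          # avoid uint8 wrap-around on subtraction
    r, g, b = c[..., 0], c[..., 1], c[..., 2]
    return (b - r) * (b - g)
```

Note that the product is also positive when both differences are negative, which any later normalization step would need to take into account.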
The FPGA uses a parallel hardware computing architecture: multiple processing units form a parallel processing array that computes the first depth map D1, the second depth map D2 and the third depth map D3 simultaneously, which improves real-time performance.
For step 30, weighted fusion of the depth maps can be used to obtain the required final depth map D. The specific method is as follows:
Weighted fusion D = αD1 + βD2 + γD3 is performed on the first depth map D1, the second depth map D2 and the third depth map D3, where α + β + γ = 1, and different values of α, β and γ are configured for different video scenes. In this way, the stereoscopic video generated from depth map D by the subsequent processing has high image definition and a good stereoscopic effect. For example, when the current frame contains mostly man-made scenes, the weights are configured for artificial scenes, i.e. 0.5 < α < 1, 0.2 < β < 0.5, 0 < γ < 0.1, for example α = 0.6875, β = 0.25, γ = 0.0625. When the current frame contains mostly natural scenes, the weights are configured for natural scenes, i.e. 0.5 < α < 1, 0 < β < 0.1, 0.2 < γ < 0.5, for example α = 0.6875, β = 0.0625, γ = 0.25.
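The weighted fusion can be sketched as below, using the two example weight sets from the text. The dictionary keys, the function name and the rounding and clipping to 8 bits are assumptions for illustration.

```python
import numpy as np

# Example weights (alpha, beta, gamma) quoted in the text for each scene type.
WEIGHTS = {
    "artificial": (0.6875, 0.25, 0.0625),
    "natural": (0.6875, 0.0625, 0.25),
}

def fuse_depth(d1, d2, d3, scene="natural"):
    """Return D = alpha*D1 + beta*D2 + gamma*D3 as an 8-bit depth map."""
    alpha, beta, gamma = WEIGHTS[scene]
    fused = (alpha * d1.astype(np.float64)
             + beta * d2.astype(np.float64)
             + gamma * d3.astype(np.float64))
    return np.clip(np.rint(fused), 0, 255).astype(np.uint8)
```

The example weights are all multiples of 1/16 (11/16, 4/16 and 1/16), which may make them convenient for shift-and-add hardware, although the text does not state this as the reason for choosing them.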
For step 40, the depth-image-based rendering (DIBR) algorithm is used. The current frame serves as the reference image required by the DIBR algorithm, and depth map D, after bilateral filtering, serves as the depth map required by the algorithm, so that multi-view stereoscopic views are generated from one reference image and one depth map. The detailed process is as follows. Referring to Fig. 2, depth map D is first bilaterally filtered as shown in Fig. 5 to obtain a smoother depth map. The bilateral filter is given by formula (3):
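The image of formula (3) is not reproduced in this text. The standard bilateral filter, written with the symbols defined in the surrounding description, is given here as a reconstruction under that assumption:

```latex
BF[I]_p = \frac{1}{W_p} \sum_{q \in S}
  G_{\sigma_s}\!\left(\lVert p - q \rVert\right)\,
  G_{\sigma_r}\!\left(\lvert I_p - I_q \rvert\right) I_q ,
\qquad
W_p = \sum_{q \in S}
  G_{\sigma_s}\!\left(\lVert p - q \rVert\right)\,
  G_{\sigma_r}\!\left(\lvert I_p - I_q \rvert\right) \tag{3}
```

Here a Gaussian of standard deviation σ is taken as G_σ(t) = exp(−t² / (2σ²)), which is where the negative exponential comes from.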
In formula (3), BF[I]p is the depth map after bilateral filtering; Wp is the normalization parameter (which keeps the depth values within 0 to 255); Gσs and Gσr are Gaussian functions with standard deviations σs and σr; Ip and Iq are the gray values of pixels p and q of depth map D; and S is the neighborhood of pixel p. As shown in Fig. 5, the depth map D is input and two quantities are computed: the Gaussian weight of the similarity between pixels, based on |Ip − Iq|, and the Gaussian weight of the spatial distance, based on ||p − q||. Bilateral filtering with formula (3) requires a negative-exponential computing module. In the FPGA, the CORDIC (Coordinate Rotation Digital Computer) algorithm computes the hyperbolic cosine and hyperbolic sine of the argument, and subtracting the hyperbolic sine from the hyperbolic cosine gives the negative exponential. To ease FPGA implementation while preserving the quality of the converted stereoscopic video, in this example S is the 8-neighborhood of p, σs is one of 4, 8, 16 and 32, and σr is one of 0.5, 0.25, 0.125 and 0.0625.
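The identity used here is exp(−x) = cosh(x) − sinh(x). The sketch below shows a floating-point version of hyperbolic CORDIC in rotation mode to illustrate it; the iteration count, function names and use of floats instead of fixed-point shifts are assumptions, and the basic scheme only converges for |x| up to about 1.118, so a real design would need range reduction for larger arguments.

```python
import math

def cordic_cosh_sinh(z, iterations=20):
    """Approximate (cosh z, sinh z) with hyperbolic CORDIC, |z| <= ~1.118."""
    # Shift sequence 1, 2, 3, 4, 4, 5, ..., 13, 13, ...: indices 4, 13, 40, ...
    # are repeated, which hyperbolic CORDIC needs in order to converge.
    shifts, i, repeat_at = [], 1, 4
    while len(shifts) < iterations:
        shifts.append(i)
        if i == repeat_at:
            shifts.append(i)
            repeat_at = 3 * repeat_at + 1
        i += 1
    shifts = shifts[:iterations]

    # Accumulated gain of the micro-rotations; pre-divide so x ends as cosh.
    gain = 1.0
    for s in shifts:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * s))

    x, y = 1.0 / gain, 0.0
    for s in shifts:
        d = 1.0 if z >= 0 else -1.0
        x, y = x + d * y * 2.0 ** -s, y + d * x * 2.0 ** -s
        z -= d * math.atanh(2.0 ** -s)
    return x, y

def exp_neg(x):
    """exp(-x) obtained as cosh(x) - sinh(x)."""
    c, s = cordic_cosh_sinh(x)
    return c - s
```

In hardware the multiplications by 2^-s become bit shifts and the atanh values come from a small lookup table, which is the usual motivation for CORDIC.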
After the smooth depth map BF[I]p has been obtained by the above bilateral filtering, the multi-view stereoscopic views are generated from the current frame (the reference image) and the depth map BF[I]p using the image mapping equations; see Fig. 2. The image mapping equations are as follows:
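The images of formulas (4-1) and (4-2) are not reproduced in this text. The horizontal-shift form commonly used in DIBR, written with the symbols defined in the following paragraph, is given here as an assumption; the sign convention in the patent may differ:

```latex
\begin{align}
x_l &= x_c + \frac{t_x}{2}\,\frac{f}{Z} \tag{4-1} \\
x_r &= x_c - \frac{t_x}{2}\,\frac{f}{Z} \tag{4-2}
\end{align}
```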
In formulas (4-1) and (4-2), xc is the horizontal coordinate of a pixel in the reference image, xl is the horizontal coordinate of the pixel in the left-eye view, and xr is the horizontal coordinate of the pixel in the right-eye view; tx is the baseline distance between the left-eye and right-eye views, and changing tx changes the parallax between the two views; f is the focal length of the virtual cameras of the left-eye and right-eye views (f = 1 in this example); and Z is the depth value at the corresponding pixel. By varying the baseline distance tx, multiple pairs of left-eye and right-eye views are obtained, which form the multi-view stereoscopic views. Mean filtering can then be applied to the multi-view stereoscopic views to fill and repair holes where necessary.
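A forward-warping sketch of this mapping is shown below. It assumes a shift of (tx/2)·(f/Z) pixels in opposite directions for the two views, a strictly positive per-pixel `z` array (how the 8-bit depth map is converted to Z is not specified in the text), and it only reports holes rather than filling them. All names are hypothetical.

```python
import numpy as np

def dibr_pair(ref, z, tx, f=1.0):
    """Warp a reference image into a left/right view pair.

    ref: H x W (or H x W x C) image, z: H x W positive depth values.
    Returns (left, right, left_ok, right_ok); *_ok mark pixels that
    received a value, so ~left_ok and ~right_ok are the holes.
    """
    h, w = z.shape
    shift = np.rint(0.5 * tx * f / z).astype(int)   # per-pixel horizontal shift
    left, right = np.zeros_like(ref), np.zeros_like(ref)
    left_ok = np.zeros((h, w), dtype=bool)
    right_ok = np.zeros((h, w), dtype=bool)
    cols = np.arange(w)
    for y in range(h):
        xl = cols + shift[y]                        # target column in the left view
        xr = cols - shift[y]                        # target column in the right view
        ok = (xl >= 0) & (xl < w)
        left[y, xl[ok]] = ref[y, cols[ok]]
        left_ok[y, xl[ok]] = True
        ok = (xr >= 0) & (xr < w)
        right[y, xr[ok]] = ref[y, cols[ok]]
        right_ok[y, xr[ok]] = True
    return left, right, left_ok, right_ok

def multi_view(ref, z, baselines, f=1.0):
    """One view pair per baseline distance tx."""
    return [dibr_pair(ref, z, tx, f) for tx in baselines]
```

No depth ordering is applied when several source pixels land on the same target column, which a fuller implementation would need to handle before hole filling.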
For step 50, specifically, for example: if the user needs stereoscopic video in red-blue or side-by-side format, one pair of left-eye and right-eye views is selected from the multi-view stereoscopic views and rendered stereoscopically; if multi-view or row-interleaved stereoscopic video is needed, multiple pairs of left-eye and right-eye views are selected and rendered stereoscopically. Color stereoscopic video can thus be generated in multiple formats.
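Two of the single-pair formats can be sketched as follows. The channel assignment for the red-blue format (red from the left view, blue from the right view, green set to zero) and the half-width column subsampling for side-by-side are assumptions; the text does not specify how either format is composed.

```python
import numpy as np

def red_blue(left, right):
    """Red-blue anaglyph from H x W x 3 RGB views (assumed channel layout)."""
    out = np.zeros_like(left)
    out[..., 0] = left[..., 0]    # red channel from the left-eye view
    out[..., 2] = right[..., 2]   # blue channel from the right-eye view
    return out

def side_by_side(left, right):
    """Side-by-side frame of the original width, each view at half width."""
    return np.hstack([left[:, ::2], right[:, ::2]])
```

Both functions take one left-eye and right-eye pair, matching the statement that these formats need only a single pair of views.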
An embodiment of the present invention also provides a device for converting planar video into stereoscopic video, shown in Fig. 6. The working modules of the device are built mainly on an FPGA. The device comprises a control module, a cache module, a video conversion module and a stereoscopic rendering module. The cache module stores the RGB video to be processed and the intermediate results of processing. The video conversion module is connected to the cache module and to the stereoscopic rendering module; it converts the planar images of the RGB video to be processed into multi-view stereoscopic views, which comprise multiple pairs of left-eye and right-eye views, and feeds them to the stereoscopic rendering module. The stereoscopic rendering module selects the corresponding left-eye and right-eye views from the multi-view stereoscopic views according to the user's stereoscopic video output format requirement, performs stereoscopic rendering on the selected views, and generates a color stereoscopic video in the corresponding format. The control module is connected to the video conversion module and to the stereoscopic rendering module and configures the device according to the user requirements, which include the stereoscopic video output format requirement.
Referring to Fig. 7, in some embodiments the video conversion module can comprise a first depth estimation module, a second depth estimation module, a third depth estimation module, a depth fusion module and a multi-view stereoscopic view generation module. The first, second and third depth estimation modules are each connected to the cache module and to the depth fusion module.
The first depth estimation module performs block-matching motion estimation on a current frame of the RGB video to be processed to obtain the first depth map D1; for the implementation, see step 21 of the method above.
The second depth estimation module performs geometric perspective estimation on the current frame to obtain the second depth map D2; for the implementation, see step 22 above and Figs. 3 and 4.
The third depth estimation module performs color-based estimation on the current frame to obtain the third depth map D3; for the implementation, see step 23 above.
The depth fusion module performs weighted fusion of the first depth map D1, the second depth map D2 and the third depth map D3 to obtain the depth map D of the current frame; for the implementation, see step 30 above.
The multi-view stereoscopic view generation module generates the multi-view stereoscopic views from the depth map D and the current frame. As shown in Fig. 7, the video conversion module produces the multi-view stereoscopic views; the stereoscopic rendering module then selects the corresponding left-eye and right-eye views from them and renders them stereoscopically, so that stereoscopic video in different formats can be generated. See the method described above for details, which are not repeated here.
In some embodiments, referring to Fig. 6, the device can also comprise a video input module and a video output module. The video input module is connected to the cache module and inputs the RGB video to be processed into the cache module. The video output module is connected to the stereoscopic rendering module and outputs the color stereoscopic video. The control module comprises a data configuration module and a human-machine communication module, where the data configuration module is connected to the video conversion module, the stereoscopic rendering module, the video input module, the video output module and the human-machine communication module. More preferably, the human-machine communication module comprises a human-machine interface and a host computer communication module, where the human-machine interface can include a keypad panel and a remote control module. The user can configure the device through the keypad panel, the remote control module or the host computer communication module.
In a more preferred embodiment, referring to Fig. 8, the video input module comprises a video signal input interface panel 100, a video input converter group 300 and an input signal selector 500, where the input signal selector 500 is connected to the data configuration module. Referring to Fig. 9, the video output module comprises an output signal selector 200, a video output converter group 400 and a video signal output interface panel 600, where the output signal selector 200 is connected to the data configuration module. As shown in Figs. 8 and 9, the video signal input interface panel 100 and the video signal output interface panel 600 each include several kinds of video interface, such as AV, BNC and USB interfaces, and the video input converter group 300 and the video output converter group 400 each include converters corresponding to each kind of video interface, such as AV, BNC and VGA converters. As shown in Figs. 6, 8 and 9, the human-machine communication module is used to input the user requirements, and the data configuration module configures the input signal selector 500, the output signal selector 200, the video conversion module and the stereoscopic rendering module accordingly. For example, to convert a planar video arriving on the BNC interface into stereoscopic video and play it through the AV interface, the user enters configuration data through the human-machine communication module. The data configuration module configures the input signal selector 500 to select the BNC converter from the video input converter group 300, which converts the planar video from the BNC interface into generic 24-bit RGB (planar) video. This video is then input to the cache module, or to the cache module and the video conversion module at the same time, and the conversion method described above is performed. According to the stereoscopic display in use, the user configures the stereoscopic rendering module to render color stereoscopic video in a format that matches the display (such as red-blue, multi-view, side-by-side or row-interleaved). The user then configures the output signal selector 200 to select the AV converter from the video output converter group 400, which converts the color stereoscopic video output by the stereoscopic rendering module into stereoscopic video matching the AV interface, ready for output and playback.
Through the man-machine communication module, the user can input data to the data configuration module in order to configure the device. Many kinds of configuration are possible: in addition to those described above, the user can also select the video scene (artificial scene or natural scene) to adjust the effect of the stereoscopic conversion. The data configuration module is responsible for applying the configuration data input by the user to the corresponding modules.
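The configuration flow described above (pick an input converter, an output converter, a stereo format and a scene type, then hand the settings to the relevant modules) can be illustrated with a minimal software sketch. The interface and format names are taken from the description; the class, function and field names are hypothetical and are not part of the patent, which implements this in hardware.

```python
from dataclasses import dataclass

# Interface and format names come from the description above;
# all class, function and field names are hypothetical.
CONVERTERS = ("AV", "BNC", "VGA")
STEREO_FORMATS = ("red-blue", "multi-view", "side-by-side", "row-interleaved")
SCENES = ("artificial", "natural")


@dataclass(frozen=True)
class DeviceConfig:
    input_converter: str   # chosen by the input signal selector 500
    output_converter: str  # chosen by the output signal selector 200
    stereo_format: str     # applied to the stereo rendering module
    scene: str             # applied to the video conversion module


def configure(input_interface, output_interface, stereo_format, scene):
    """Validate the user's choices and return the settings to distribute."""
    checks = (
        (input_interface, CONVERTERS, "input interface"),
        (output_interface, CONVERTERS, "output interface"),
        (stereo_format, STEREO_FORMATS, "stereo format"),
        (scene, SCENES, "scene"),
    )
    for value, allowed, label in checks:
        if value not in allowed:
            raise ValueError(f"unsupported {label}: {value!r}")
    return DeviceConfig(input_interface, output_interface, stereo_format, scene)


# The BNC-in, AV-out example from the description.
config = configure("BNC", "AV", "side-by-side", "natural")
```

Keeping the settings in one immutable record mirrors the role of the data configuration module, which is the single place from which the other modules receive their settings.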
The remote control module includes, for example, a wireless remote control and a remote control signal receiving module. The user can perform data configuration through the remote control module: with the wireless remote control, the user can remotely configure the video signal input interface and the video signal output interface, select the output stereoscopic video format, adjust the stereoscopic conversion effect, select the video scene conversion mode, and so on. The key panel includes multiple function keys, with which the user can configure the video signal input interface, configure the video signal output interface, configure the output stereoscopic video format, adjust the stereoscopic conversion effect, and so on.
The host computer communication module can communicate with a PC host computer by a suitable means of communication. Through this module, the current status information of the device can be transferred to the host computer, and the device can also be configured from the host computer.
The cache module includes a storage control module inside the FPGA and a large-capacity off-chip SDRAM. Referring to Fig. 10, it is used to cache data such as the 24-bit RGB digital video signal input by a front-end module (for example, the aforementioned video input module) and the intermediate and final results of the conversion algorithm. The storage control module uses the FPGA on-chip RAM, a FIFO control module and an SDRAM control module to form an asynchronous large-capacity FIFO cache structure. In addition, random reads and writes of the SDRAM are implemented by the SDRAM controller in the storage control module.
The above content describes the present invention in further detail in combination with specific preferred embodiments, and the specific implementation of the present invention cannot be regarded as limited to these descriptions. Those skilled in the art to which the invention pertains may make equivalent substitutions or obvious modifications with the same performance or use without departing from the concept of the present invention, and all of these shall be considered as falling within the protection scope of the present invention.

Claims (7)

1. A method of converting planar video into stereoscopic video, comprising performing the following steps on each frame image of the planar video:
S1, obtaining a depth map D of the current frame image: obtaining a first depth map D1 by motion estimation based on block matching; extracting the vanishing point and vanishing lines in the current frame image by an edge detection algorithm and a Hough transform algorithm, and constructing a second depth map D2 according to the relationship between image depth and the vanishing point and vanishing lines; estimating a third depth map D3 by a method based on color information; performing depth fusion on the first depth map D1, the second depth map D2 and the third depth map D3 to obtain the depth map D of the current frame image; and applying bilateral filtering to the depth map D to obtain a smoother depth map BF[I]p;
S2, based on a DIBR algorithm, generating multi-view stereoscopic views from a reference image and the depth map BF[I]p, wherein the reference image is the current frame image and the multi-view stereoscopic views include multiple pairs of left-eye and right-eye views;
S3, according to the stereoscopic video format required by the user, selecting at least one pair of the left-eye and right-eye views from the multi-view stereoscopic views and performing stereo rendering, to generate color stereoscopic video in the corresponding format;
wherein the formula of the bilateral filtering is as follows:
$$BF[I]_p = \frac{1}{W_p} \sum_{q \in S} G_{\sigma_s}\left(\lVert p - q \rVert\right)\, G_{\sigma_r}\left(\lvert I_p - I_q \rvert\right)\, I_q$$
In the above formula, BF[I]p is the depth map after bilateral filtering; Wp is a normalization parameter that brings the depth values into the range 0 to 255; Gσs and Gσr denote Gaussian functions with standard deviations σs and σr respectively; Ip and Iq denote the gray values of pixel p and pixel q of the depth map D; |Ip-Iq| is the argument of the Gaussian weight for the similarity between pixels p and q; ||p-q|| is the argument of the Gaussian weight for the spatial distance between pixels p and q; S denotes the neighborhood of pixel p, taken as the eight-neighborhood of p; σs takes one of the values 4, 8, 16 and 32; and σr takes one of the values 0.5, 0.25, 0.125 and 0.0625;
wherein a negative-exponent computation module is used when performing the bilateral filtering with the above formula: in the FPGA, the Coordinate Rotation Digital Computer (CORDIC) algorithm computes the hyperbolic cosine and the hyperbolic sine of the variable value separately, and the hyperbolic sine result is subtracted from the hyperbolic cosine result to obtain the value of the negative exponential function.
2. The method of claim 1, characterized in that obtaining the first depth map D1 in step S1 specifically comprises: extracting two consecutive frame images I1 and I2 from the cache, with I1 as the current frame and I2 as the reference frame, wherein I2 is the frame preceding I1; calculating a first motion vector by motion estimation based on block matching; obtaining a prediction frame Ipre according to the current frame I1 and the first motion vector; then, with the prediction frame Ipre as the reference frame and I2 as the current frame, calculating a second motion vector by motion estimation based on block matching; and obtaining the first depth map D1 according to the second motion vector, wherein the gray value of each point in the first depth map D1 is the modulus of the motion vector of the corresponding pixel in I2.
3. The method of claim 2, characterized in that the motion estimation uses a hardware architecture designed around a parallel algorithm in the FPGA, and multiple processing units in the FPGA form a parallel processing array that computes the first depth map D1, the second depth map D2 and the third depth map D3 simultaneously.
4. The method of claim 1, characterized in that the relationship between image depth and the vanishing point and vanishing lines in step S1 is: the intersection of the vanishing lines is the vanishing point, the vanishing point is the point of maximum depth, and image depth changes from maximum to minimum along the vanishing lines.
5. The method of claim 1, characterized in that obtaining the third depth map D3 in step S1 specifically comprises: calculating, for each pixel in the current frame image, a first difference between the blue component and the red component and a second difference between the blue component and the green component, and using the product of the first difference and the second difference as the pixel value of the corresponding pixel of the third depth map, thereby obtaining the third depth map D3.
6. The method of claim 1, characterized in that performing the depth fusion in step S1 to obtain the depth map D specifically comprises: performing weighted depth map fusion D = αD1 + βD2 + γD3 on the first depth map D1, the second depth map D2 and the third depth map D3, wherein α + β + γ = 1.
7. The method of claim 6, characterized in that the video scene includes an artificial scene and a natural scene; when the video scene is an artificial scene, 0.5 < α < 1, 0.2 < β < 0.5 and 0 < γ < 0.1; when the video scene is a natural scene, 0.5 < α < 1, 0 < β < 0.1 and 0.2 < γ < 0.5.
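The color-based depth map of claim 5 and the weighted fusion of claims 6 and 7 amount to per-pixel arithmetic, sketched below in plain Python. The concrete weight triples are hypothetical values chosen inside the ranges of claim 7 so that they sum to 1; the claims give only the ranges. Images are represented as nested lists, which is an assumption made for readability.

```python
def color_depth(rgb_image):
    """Third depth map D3: (B - R) * (B - G) for every (R, G, B) pixel."""
    return [[(b - r) * (b - g) for (r, g, b) in row] for row in rgb_image]


# Hypothetical (alpha, beta, gamma) triples inside the ranges of claim 7.
SCENE_WEIGHTS = {
    "artificial": (0.6, 0.35, 0.05),
    "natural": (0.6, 0.05, 0.35),
}


def fuse_depth(d1, d2, d3, scene):
    """Weighted fusion D = alpha*D1 + beta*D2 + gamma*D3, per pixel."""
    alpha, beta, gamma = SCENE_WEIGHTS[scene]
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [
        [alpha * a + beta * b + gamma * c for a, b, c in zip(r1, r2, r3)]
        for r1, r2, r3 in zip(d1, d2, d3)
    ]
```

The product in `color_depth` can be negative or far larger than 255, so a practical pipeline would presumably rescale D3 before fusion; the claims do not state how, and the sketch leaves the raw product untouched.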
Application CN201410697508.1A, priority date 2014-11-26, filing date 2014-11-26: A kind of method and device of converting plane video into stereoscopic video. Status: Active. Publication: CN104506872B (en)


Publications (2)

Publication Number Publication Date
CN104506872A (en) 2015-04-08
CN104506872B (en) 2017-09-29

Family

ID=52948579


Country Status (1)

Country Link
CN (1) CN104506872B (en)




Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20190424

Address after: 310000 Room 702, 7th Floor, 15 Yinhu Innovation Center, No. 9 Fuxian Road, Yinhu Street, Fuyang District, Hangzhou City, Zhejiang Province

Patentee after: Hangzhou Youshu Technology Co., Ltd.

Address before: 518000 Shenzhen Nanshan District Shekou Street Park South Road Nanshan Internet Innovation and Creative Service Base A303

Patentee before: SHENZHEN KAIAOSI TECHNOLOGY CO., LTD.
