CN108986197A

CN108986197A - 3D skeleton line construction method and device

Info

Publication number: CN108986197A
Application number: CN201711244255.2A
Authority: CN
Inventors: 张�杰; 毛河; 龙学军; 周剑
Original assignee: Chengdu Tongjia Youbo Technology Co Ltd
Current assignee: Chengdu Tongjia Youbo Technology Co Ltd
Priority date: 2017-11-30
Filing date: 2017-11-30
Publication date: 2018-12-11
Anticipated expiration: 2037-11-30
Also published as: CN108986197B

Abstract

The present invention relates to the technical field of computer vision and provides a 3D skeleton line construction method and device. First, an original image acquired by a camera device is input into a pre-trained convolutional neural network to obtain a 2D skeleton line. Then, the original image is corrected using calibration data to remove distortion, and binocular stereo matching is performed on the corrected first image and second image to obtain a depth map. Finally, the 2D skeleton line and the depth map are combined to render a 3D skeleton line, thereby extending skeleton line applications to 3D for motion-sensing interactive games. Compared with the prior art, the binocular matching method used in the embodiments of the present invention places lower requirements on equipment, is low in cost, and has good practicability.

Description

3D skeleton line construction method and device

Technical field

The present invention relates to the technical field of computer vision, and in particular to a 3D skeleton line construction method and device.

Background technique

In existing skeleton line applications, most work concentrates on 2D skeleton lines, which are used to identify the 2D posture of an object. However, a 2D posture cannot adequately characterize 3D postures of the human body such as rotation, which limits its use in motion-sensing games. Meanwhile, in existing commercial applications, 3D data is acquired with physical devices such as Kinect, which obtain the 3D information of object surfaces through technologies such as structured light and TOF. These technologies require substantial hardware support; their high price, high power consumption, and large volume mean that they cannot meet the requirements of cost saving and portability.

Summary of the invention

An object of the embodiments of the present invention is to provide a 3D skeleton line construction method and device to address the above problems.

To achieve the above object, the technical solutions adopted by the embodiments of the present invention are as follows:

In a first aspect, an embodiment of the present invention provides a 3D skeleton line construction method applied to an electronic device provided with a camera device, the camera device including a first lens and a second lens. The method comprises: obtaining an original image acquired by the camera device, wherein the original image includes a first image acquired by the first lens and a second image acquired by the second lens; inputting the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line; correcting the original image using calibration data to remove the distortion of the original image; performing binocular stereo matching on the corrected first image and second image to obtain a depth map; and combining the 2D skeleton line and the depth map to render a 3D skeleton line.

In a second aspect, an embodiment of the present invention further provides a 3D skeleton line construction device applied to an electronic device provided with a camera device, the camera device including a first lens and a second lens. The 3D skeleton line construction device includes an original image acquisition module, a 2D skeleton line acquisition module, an image correction module, a stereo matching module, and an execution module. The original image acquisition module is used to obtain the original image acquired by the camera device, wherein the original image includes a first image acquired by the first lens and a second image acquired by the second lens. The 2D skeleton line acquisition module is used to input the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line. The image correction module is used to correct the original image using calibration data to remove the distortion of the original image. The stereo matching module is used to perform binocular stereo matching on the corrected first image and second image to obtain a depth map. The execution module is used to combine the 2D skeleton line and the depth map to render a 3D skeleton line.

Compared with the prior art, in the 3D skeleton line construction method and device provided by the embodiments of the present invention, the original image acquired by the camera device is first input into a pre-trained convolutional neural network to obtain a 2D skeleton line. Then, the original image is corrected using calibration data to remove distortion, and binocular stereo matching is performed on the corrected first image and second image to obtain a depth map. Finally, the 2D skeleton line and the depth map are combined to render a 3D skeleton line, thereby extending skeleton line applications to 3D for motion-sensing interactive games. The binocular matching method used in the embodiments of the present invention places lower requirements on equipment, is low in cost, and has good practicability.

To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief description of the drawings

In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and therefore should not be regarded as limiting its scope. Those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.

Fig. 1 shows a block diagram of the electronic device provided by an embodiment of the present invention.

Fig. 2 shows a flowchart of the 3D skeleton line construction method provided by an embodiment of the present invention.

Fig. 3 is a flowchart of the sub-steps of step S102 shown in Fig. 2.

Fig. 4 is a flowchart of the sub-steps of step S104 shown in Fig. 2.

Fig. 5 is a flowchart of the sub-steps of sub-step S1041 shown in Fig. 4.

Fig. 6 is a flowchart of the sub-steps of sub-step S1042 shown in Fig. 4.

Fig. 7 shows a block diagram of the 3D skeleton line construction device provided by an embodiment of the present invention.

Fig. 8 is a block diagram of the 2D skeleton line acquisition module in the 3D skeleton line construction device shown in Fig. 7.

Fig. 9 is a block diagram of the stereo matching module in the 3D skeleton line construction device shown in Fig. 7.

Fig. 10 is a block diagram of the feature description unit in the stereo matching module shown in Fig. 9.

Fig. 11 is a block diagram of the binocular disparity map acquisition unit in the stereo matching module shown in Fig. 9.

Reference numerals: 100 - electronic device; 101 - memory; 102 - memory controller; 103 - processor; 104 - peripheral interface; 105 - camera device; 106 - display screen; 200 - 3D skeleton line construction device; 201 - original image acquisition module; 202 - 2D skeleton line acquisition module; 2021 - joint point detection unit; 2022 - joint point classification unit; 203 - image correction module; 204 - stereo matching module; 2041 - feature description unit; 20411 - first binary string construction unit; 20412 - first feature map acquisition unit; 20413 - second binary string construction unit; 20414 - second feature map acquisition unit; 2042 - binocular disparity map acquisition unit; 20421 - first disparity map acquisition unit; 20422 - second disparity map acquisition unit; 20423 - disparity noise elimination unit; 20424 - mismatched point elimination unit; 20425 - initial disparity map acquisition unit; 20426 - processing optimization unit; 2043 - depth map acquisition unit; 205 - execution module.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

It should be noted that similar reference numerals and letters denote similar items in the following drawings. Therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.

A common 2D skeleton line algorithm can capture the changes of the human skeleton during motion, but because it has no depth information, it cannot distinguish which parts of the skeleton are in front of or behind others, and therefore cannot simulate the posture of the human skeleton in 3D space in motion-sensing games. In existing skeleton line applications, a device with a depth sensor, such as Kinect, is required in most cases. Such devices are costly and inconvenient to carry, cannot be used on small mobile devices, and must be used at a fixed indoor location.

In view of this, the inventors found through long-term research that a 3D skeleton line can be rendered by adding the corresponding depth information to a 2D skeleton line. By means of binocular stereo matching, skeleton lines can be ported to mobile phones, embedded devices, and other devices equipped with a binocular camera, which broadens the application range of traditional skeleton lines.

The present invention is described in further detail below through specific embodiments and with reference to the drawings.

Please refer to Fig. 1, which shows a block diagram of the electronic device 100 provided by an embodiment of the present invention. The electronic device 100 may be, but is not limited to, a smartphone, a tablet computer, a laptop computer, a vehicle-mounted computer, a personal digital assistant (PDA), a wearable mobile terminal, and the like. The electronic device 100 includes a 3D skeleton line construction device 200, a memory 101, a memory controller 102, a processor 103, a peripheral interface 104, a camera device 105, and a display screen 106.

The memory 101, memory controller 102, processor 103, peripheral interface 104, camera device 105, and display screen 106 are electrically connected to one another, directly or indirectly, to enable the transmission or exchange of data. For example, these elements may be electrically connected to one another through one or more communication buses or signal lines. The 3D skeleton line construction device 200 includes at least one software function module that can be stored in the memory 101 in the form of software or firmware, or built into the operating system (OS) of the electronic device 100. The processor 103 is used to execute executable modules stored in the memory 101, such as the software function modules or computer programs included in the 3D skeleton line construction device 200.

The memory 101 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), and the like. The memory 101 is used to store a program, and the processor 103 executes the program after receiving an execution instruction. The method defined by the flow disclosed in any embodiment of the present invention may be applied to the processor 103 or implemented by the processor 103.

The processor 103 may be an integrated circuit chip with signal processing capability. The processor 103 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a speech processor, a video processor, and the like; it may also be a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor 103 may be any conventional processor.

The peripheral interface 104 couples various input/output devices to the processor 103 and the memory 101. In some embodiments, the peripheral interface 104, the processor 103, and the memory controller 102 may be implemented in a single chip. In some other examples, they may each be implemented by a separate chip.

The camera device 105 is used to acquire the original image. The camera device 105 includes a first lens and a second lens; the first lens acquires the first image of the original image, and the second lens acquires the second image of the original image. In this embodiment, the camera device 105 may be, but is not limited to, a binocular camera, a multi-lens camera, or the like.

The display screen 106 is used for interaction between the user and the electronic device 100. Specifically, it may, among other things, display the constructed 3D skeleton line.

First embodiment

Please refer to Fig. 2, which shows a flowchart of the 3D skeleton line construction method provided by an embodiment of the present invention. The 3D skeleton line construction method includes the following steps:

Step S101: obtain the original image acquired by the camera device, wherein the original image includes a first image acquired by the first lens and a second image acquired by the second lens.

In this embodiment, the camera device 105 may be a binocular camera. When acquiring an original image, the binocular camera returns two corresponding images: the first image acquired by the first lens and the second image acquired by the second lens. The first image may be the left image of the original image acquired by the binocular camera, and the second image may be the right image.

Step S102: input the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line.

In this embodiment, the original image may be input into a pre-trained convolutional neural network, which detects and classifies human joint points to obtain the 2D skeleton line. The original image used here may be either the first image or the second image.

As one implementation, detecting and classifying human joint points by the convolutional neural network may include: inputting the first image or the second image into the pre-trained convolutional neural network, which performs human joint point detection on that image and obtains the position coordinates of each joint point of the human body together with the corresponding joint point label, thereby obtaining the 2D skeleton line.

Referring to Fig. 3, step S102 may include the following sub-steps:

Sub-step S1021: input the original image into the pre-trained convolutional neural network, and detect a plurality of joint points by the convolutional neural network.

In this embodiment, the plurality of joint points detected by the convolutional neural network may belong to the same human body or to different human bodies.

Sub-step S1022: classify the plurality of joint points using the convolutional neural network to obtain the 2D skeleton line.

In this embodiment, the plurality of joint points are classified using the convolutional neural network, and the joint points in each class share the attribute of belonging to the same human body. As one implementation, this attribute may be characterized by an affinity domain: a distance affinity relationship exists between joint points belonging to the same human body. Specifically, the distance between joint points belonging to the same human body always lies within a certain range, and this range is the affinity domain of the joint points. Joint points that always lie within the same affinity domain belong to the same person.
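The distance-range idea above can be illustrated with a small sketch. In the embodiment the grouping is performed by the convolutional neural network; the hand-written rule below is only a stand-in that makes the "same affinity domain" criterion concrete. The function name `group_joints`, the threshold `max_dist`, and the sample coordinates are all hypothetical.

```python
import math

def group_joints(joints, max_dist):
    """Greedily group 2D joint points so that every pair inside a group
    is at most max_dist apart (a stand-in for the affinity domain)."""
    groups = []
    for joint in joints:
        for group in groups:
            # Join the first group whose members are all within range.
            if all(math.dist(joint, other) <= max_dist for other in group):
                group.append(joint)
                break
        else:
            # No existing group accepts this joint: start a new person.
            groups.append([joint])
    return groups

# Hypothetical pixel coordinates of joints from two people standing apart.
joints = [(10, 10), (12, 30), (200, 12), (11, 50), (203, 35)]
people = group_joints(joints, max_dist=60)
print(people)
```

With these sample points the joints near x = 10 end up in one group and the joints near x = 200 in another. A learned model would be needed for people who stand close together, which is presumably why the embodiment relies on the network rather than on a fixed threshold.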

Step S103: correct the original image using calibration data to remove the distortion of the original image.

In this embodiment, correcting the original image means correcting the first image and the second image separately using the calibration data, so as to remove the distortion of the original image.

Step S104: perform binocular stereo matching on the corrected first image and second image to obtain a depth map.

In this embodiment, after the first image and the second image are corrected, feature description is first performed on the corrected first image and second image respectively to obtain a first feature map and a second feature map. Then, cost calculation, cost aggregation, left-right consistency detection, and disparity post-processing are applied to the first feature map and the second feature map to obtain a binocular disparity map. Finally, triangulation is performed on the binocular disparity map to obtain the depth map.

Referring to Fig. 4, step S104 may include the following sub-steps:

Sub-step S1041: perform feature description on the corrected first image and second image respectively to obtain a first feature map and a second feature map.

As one implementation, performing feature description on the corrected first image to obtain the first feature map may include the following. First, a first binary string is constructed from the magnitude relationships between pixels in the first image. Then, the value of each pixel is replaced with its first binary string, so that the grayscale or color information of the pixel is mapped to a binary feature that encodes the magnitude relationship between the pixel and its neighborhood. For example, consider the following 3*3 window of 9 pixels:

123	127	129
126	128	129
127	131	130

The pixel value of each neighborhood pixel in the window is compared with the pixel value of the central pixel. If the pixel value of a neighborhood pixel is smaller than that of the central pixel, the comparison result is 1; if it is larger than that of the central pixel, the comparison result is 0. The central position itself is recorded as 0. This yields the following binary code:

1	1	0
1	0	0
1	0	0

Reading the code row by row gives the binary string 110100100 for the central pixel, and this binary string replaces the pixel value of the central pixel. For each pixel in the first image, the pixels in the 3*3 neighborhood around it can be used to obtain the first binary string for that pixel, which then replaces the pixel value of that pixel; in this way the first feature map is obtained. Feature description of the corrected second image to obtain the second feature map is performed in the same way and is not repeated here.
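The worked 3*3 example can be reproduced with a few lines of code. This is a minimal sketch of the comparison rule described above, using plain nested lists; the function name `census_3x3` is hypothetical, and border pixels (which lack a full neighborhood) are not handled because the text does not say how they are treated.

```python
def census_3x3(img, r, c):
    """Return the 9-bit binary string for the interior pixel at row r, column c."""
    center = img[r][c]
    bits = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            # 1 if the window pixel is strictly smaller than the center, else 0.
            # The center compared with itself therefore contributes 0.
            bits.append("1" if img[r + dr][c + dc] < center else "0")
    return "".join(bits)

# The window from the example in the description.
window = [
    [123, 127, 129],
    [126, 128, 129],
    [127, 131, 130],
]
code = census_3x3(window, 1, 1)
print(code)  # 110100100
```

Applying the same function to every interior pixel of the corrected image and storing the resulting string in place of the pixel value would give the feature map.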

Referring to Fig. 5, sub-step S1041 may include the following sub-steps:

Sub-step S10411: construct the first binary string corresponding to each pixel from the magnitude relationship between each pixel in the corrected first image and its neighborhood pixels.

In this embodiment, the neighborhood may be 3*3 in size. If the pixel value of a pixel in the first image is greater than the pixel value of one of its neighborhood pixels, the comparison result with that neighborhood pixel is 1; if it is less than the pixel value of that neighborhood pixel, the comparison result is 0. The comparison results are then arranged in order to obtain the first binary string corresponding to the pixel.

Sub-step S10412: replace the pixel value of each pixel with its first binary string to obtain the first feature map.

Sub-step S10413: construct the second binary string corresponding to each pixel from the magnitude relationship between each pixel in the corrected second image and its neighborhood pixels.

In this embodiment, the neighborhood may be 3*3 in size. If the pixel value of a pixel in the second image is greater than the pixel value of one of its neighborhood pixels, the comparison result with that neighborhood pixel is 1; if it is less than the pixel value of that neighborhood pixel, the comparison result is 0. The comparison results are then arranged in order to obtain the second binary string corresponding to the pixel.

Sub-step S10414: replace the pixel value of each pixel with its second binary string to obtain the second feature map.

Sub-step S1042: apply cost calculation, cost aggregation, left-right consistency detection, and disparity post-processing to the first feature map and the second feature map to obtain a binocular disparity map.

As one implementation, applying cost calculation, cost aggregation, left-right consistency detection, and disparity post-processing to the first feature map and the second feature map to obtain the binocular disparity map may include:

1. The Hamming distance is used to calculate the matching cost between each pixel in the first feature map and each pixel within the corresponding disparity search range in the second feature map, yielding a first disparity map. The first disparity map is obtained by cost calculation with the first feature map as the reference map and the second feature map as the target map. Likewise, the Hamming distance is used to calculate the matching cost between each pixel in the second feature map and each pixel within the corresponding disparity search range in the first feature map, yielding a second disparity map. The second disparity map is obtained by cost calculation with the second feature map as the reference map and the first feature map as the target map.

2. Cost aggregation is used to eliminate the disparity noise in the first disparity map and the second disparity map. The first and second disparity maps obtained by cost calculation contain disparity noise, so a pixel window is established around each pixel and the comparison is made between the pixel blocks contained in the windows. The constraint provided by adjacent points within the pixel window refines the cost calculation and thereby eliminates the disparity noise. The cost aggregation may be, but is not limited to, box filtering, Gaussian filtering, non-local means filtering, guided filtering, and the like.

3. Disparity consistency detection is used to eliminate the mismatched points generated during cost calculation. The method may include the following. Let the first disparity map be L and the second disparity map be R, and let the pixel $(x_l, y)$ on the first feature map and the pixel $(x_r, y)$ on the second feature map be a pair of matching points. Since the images on which stereo matching is performed in this embodiment are all corrected images, only horizontal disparity exists, and the disparity values of this pair in the first disparity map and the second disparity map are $d_{lr}(x_l, y)$ and $d_{rl}(x_r, y)$ respectively. Mismatched points generated during cost calculation are then identified with the following condition:

$$|d_{lr}(x_l, y) + d_{rl}(x_r, y)| \le \lambda$$

where $x_r = x_l + d_{lr}(x_l, y)$ and $\lambda$ is the allowed disparity error threshold. When the disparity error of the corresponding points in the two disparity maps satisfies this condition, the disparity match of the corresponding points is correct. When the disparity error does not satisfy this condition, the point is a mismatched point.

4. After the mismatched points are determined, the first disparity map and the second disparity map are merged to obtain an initial disparity map, so as to fill in the disparity information of the mismatched points.

5. The initial disparity map is processed and optimized to obtain the binocular disparity map. To eliminate cases in which a joint point of the 2D skeleton falls on or next to the background, the initial disparity map needs to be expanded. The edges of the initial disparity map are extracted and extended; then, combining the initial disparity map with the edge mask, minimum filtering is applied so that the disparity regions are "fattened". This weakens the effect of joint points landing on the background, and a high-quality binocular disparity map is finally obtained.

Referring to Fig. 6, sub-step S1042 may include the following sub-steps:

Sub-step S10421: calculate the matching cost between each pixel in the first feature map and each pixel within the corresponding disparity search range in the second feature map to obtain the first disparity map, wherein the first disparity map is obtained by cost calculation with the first feature map as the reference map and the second feature map as the target map.

In this embodiment, the Hamming distance is used to calculate the matching cost between each pixel in the first feature map and each pixel within the corresponding disparity search range in the second feature map, yielding the first disparity map. The disparity search range may be the row or the column of the second feature map that corresponds to the pixel in the first feature map. The Hamming distance measures the difference between two binary strings, that is, the number of positions at which they differ; for example, the Hamming distance between 110011000 and 000011000 is 2.
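The following sketch shows the Hamming cost on binary strings and how it might be evaluated along one row of the target feature map. The helper names `hamming` and `match_costs`, the sample rows, and the final step of picking the lowest-cost offset (a winner-takes-all choice) are assumptions for illustration; the description does not state how the disparity is selected from the costs.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary strings differ."""
    if len(a) != len(b):
        raise ValueError("binary strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def match_costs(ref_row, tgt_row, x, offsets):
    """Cost of matching reference pixel x against target pixel x + d
    for every candidate offset d that stays inside the row."""
    costs = {}
    for d in offsets:
        xt = x + d
        if 0 <= xt < len(tgt_row):
            costs[d] = hamming(ref_row[x], tgt_row[xt])
    return costs

# The example from the text.
print(hamming("110011000", "000011000"))  # 2

# Hypothetical one-row feature maps holding census strings.
ref_row = ["110011000", "000011000", "111111111"]
tgt_row = ["000011000", "111111111", "110011000"]
costs = match_costs(ref_row, tgt_row, x=0, offsets=range(0, 3))
best = min(costs, key=costs.get)  # assumed winner-takes-all selection
print(costs, best)
```

Swapping the roles of `ref_row` and `tgt_row` gives the costs for the second disparity map.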

Sub-step S10422: calculate the matching cost between each pixel in the second feature map and each pixel within the corresponding disparity search range in the first feature map to obtain the second disparity map, wherein the second disparity map is obtained by cost calculation with the second feature map as the reference map and the first feature map as the target map.

In this embodiment, the Hamming distance is used to calculate the matching cost between each pixel in the second feature map and each pixel within the corresponding disparity search range in the first feature map, yielding the second disparity map. The disparity search range may be the row or the column of the first feature map that corresponds to the pixel in the second feature map.

Sub-step S10423: eliminate the disparity noise in the first disparity map and the second disparity map using cost aggregation.

In this embodiment, the first disparity map and the second disparity map obtained by cost calculation contain disparity noise, so a pixel window is established around each pixel and the comparison is made between the pixel blocks contained in the windows. The constraint provided by adjacent points within the pixel window refines the cost calculation and thereby eliminates the disparity noise. The cost aggregation may be, but is not limited to, box filtering, Gaussian filtering, non-local means filtering, guided filtering, and the like.
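As an illustration of the box-filtering option, the sketch below averages the matching costs of one candidate disparity over a square window, so that an isolated noisy cost is pulled toward the values of its neighbors. The function name `box_aggregate`, the `radius` parameter, and the shrinking of the window at the image border are assumptions, not details taken from the description.

```python
def box_aggregate(cost, radius=1):
    """Average each cost over a (2*radius+1)-sized square window.
    The window is clipped at the image border (an assumed convention)."""
    h, w = len(cost), len(cost[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total, count = 0.0, 0
            for rr in range(max(0, r - radius), min(h, r + radius + 1)):
                for cc in range(max(0, c - radius), min(w, c + radius + 1)):
                    total += cost[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out

# Hypothetical cost slice for one disparity: a single outlier of 9.
cost_slice = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = box_aggregate(cost_slice, radius=1)
print(smoothed[1][1], smoothed[0][0])  # 1.0 2.25
```

In a full pipeline this would be run once per candidate disparity before the lowest aggregated cost is chosen; the other listed filters differ only in how the neighbors are weighted.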

Sub-step S10424: eliminate the mismatched points generated during cost calculation using disparity consistency detection.
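A minimal sketch of the consistency test follows, using the sign convention stated in the description: the matching column is `x_r = x_l + d_lr`, and a match is accepted when the absolute value of `d_lr + d_rl` does not exceed the threshold. The function name `lr_check`, integer-valued disparities, and the choice to treat matches that leave the image as mismatches are assumptions.

```python
def lr_check(d_lr, d_rl, lam=1):
    """Return a mask that is True where the two disparity maps agree."""
    h, w = len(d_lr), len(d_lr[0])
    ok = [[False] * w for _ in range(h)]
    for y in range(h):
        for x_l in range(w):
            x_r = x_l + d_lr[y][x_l]          # matching column in the second map
            if 0 <= x_r < w:                  # out-of-image matches stay False
                ok[y][x_l] = abs(d_lr[y][x_l] + d_rl[y][x_r]) <= lam
    return ok

# Hypothetical one-row disparity maps with signed integer disparities.
d_lr = [[0, -1, -1, -3]]
d_rl = [[1, 1, 0, 0]]
mask = lr_check(d_lr, d_rl, lam=1)
print(mask)  # [[True, True, True, False]]
```

The last pixel fails because its disparity of -3 points to a target pixel whose own disparity is 1, giving an error of 2, which exceeds the threshold.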

Sub-step S10425: merge the first disparity map and the second disparity map to obtain an initial disparity map, filling in the disparity information of the mismatched points.

Sub-step S10426: process and optimize the initial disparity map to obtain the binocular disparity map.

In this embodiment, to eliminate cases in which a joint point of the 2D skeleton falls on or next to the background, the initial disparity map needs to be expanded. The edges of the initial disparity map are extracted and extended; then, combining the initial disparity map with the edge mask, minimum filtering is applied so that the disparity regions are "fattened". This weakens the effect of joint points landing on the background and yields the binocular disparity map.

Sub-step S1043: perform triangulation on the binocular disparity map to obtain the depth map.

In this embodiment, the depth value is related to the disparity by

$$Z = \frac{B f}{D}$$

where Z denotes the depth value, B denotes the spacing between the first lens and the second lens, f is the focal length of the camera device 105, and D is the disparity. With the disparity known, the depth value of each pixel is calculated, so that the binocular disparity map is converted into the depth map.
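The conversion from disparity to depth is a per-pixel division. The sketch below assumes that D is a positive disparity magnitude in pixels, that f is also expressed in pixels, and that pixels with no valid disparity are marked `None`; these conventions and the numeric values are illustrative and not taken from the description.

```python
def disparity_to_depth(disp, baseline, focal):
    """Apply Z = B * f / D to every pixel of a disparity map."""
    depth = []
    for row in disp:
        # Non-positive disparities carry no depth information here.
        depth.append([baseline * focal / d if d > 0 else None for d in row])
    return depth

# Hypothetical values: 60 mm lens spacing, 700 px focal length.
disp_map = [[42, 21, 0]]
depth_map = disparity_to_depth(disp_map, baseline=60.0, focal=700.0)
print(depth_map)  # [[1000.0, 2000.0, None]]
```

The depth comes out in the unit of the lens spacing (millimeters in this example), and halving the disparity doubles the depth.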

Step S105: combine the 2D skeleton line and the depth map to render a 3D skeleton line.

In this embodiment, the 2D skeleton line obtained from the first image may be combined with the depth map to render the 3D skeleton line, or the 2D skeleton line obtained from the second image may be combined with the depth map to render the 3D skeleton line. As one implementation, the joint point coordinates of the 2D skeleton line are combined with the corresponding binocular disparity map or depth map, and the 3D skeleton line can then be rendered using OpenGL.
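One plausible way to combine the two inputs is to look up the depth at each joint's pixel and back-project it with a pinhole camera model. The description does not specify this step, so the function name `lift_joints`, the principal point `(cx, cy)`, the joint names, and the numbers below are all assumptions; the OpenGL rendering itself is not shown.

```python
def lift_joints(joints_2d, depth, focal, cx, cy):
    """Turn {name: (u, v)} pixel joints into {name: (X, Y, Z)} using
    X = (u - cx) * Z / f and Y = (v - cy) * Z / f (assumed pinhole model)."""
    joints_3d = {}
    for name, (u, v) in joints_2d.items():
        z = depth[v][u]            # depth map is indexed [row][column]
        if z is None:
            continue               # skip joints without a valid depth
        joints_3d[name] = ((u - cx) * z / focal, (v - cy) * z / focal, z)
    return joints_3d

# Hypothetical 2x3 depth map and three detected joints.
depth = [
    [None, 4.0, 4.0],
    [6.0, 6.0, None],
]
joints_2d = {"head": (2, 0), "neck": (1, 1), "hand": (0, 0)}
joints_3d = lift_joints(joints_2d, depth, focal=2.0, cx=1.0, cy=0.0)
print(joints_3d)
```

Here the `hand` joint is dropped because its depth is missing; the remaining 3D points could be passed to a renderer as line-segment endpoints.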

In this embodiment, first, the 2D skeleton line is combined with binocular stereo matching to realize 3D skeleton line applications, thereby extending traditional skeleton line applications to 3D. Second, because binocular stereo matching places lower requirements on equipment, it can, within a certain range, replace costlier devices such as Kinect for motion-sensing interactive games. Finally, by using binocular matching, skeleton lines can be ported to mobile phones and embedded devices equipped with a binocular camera, which broadens the application range of traditional skeleton line algorithms. At the same time, because the skeleton line carries 3D information, motion-sensing interactive applications considerably better than those based on 2D skeleton lines can be realized.

Second embodiment

Referring to Fig. 7, Fig. 7 shows a block diagram of the 3D skeleton line construction device 200 provided by an embodiment of the present invention. The 3D skeleton line construction device 200 includes an original image obtaining module 201, a 2D skeleton line obtaining module 202, an image correction module 203, a stereo matching module 204 and an execution module 205.

The original image obtaining module 201 is configured to obtain the original image acquired by the photographic device, wherein the original image includes a first image acquired by the first camera lens and a second image acquired by the second camera lens.

In embodiments of the present invention, the original image obtaining module 201 may be used to execute step S101.

The 2D skeleton line obtaining module 202 is configured to input the original image into a pre-trained convolutional neural network to obtain the 2D skeleton line.

In embodiments of the present invention, the 2D skeleton line obtaining module 202 may be used to execute step S102.

Referring to Fig. 8, Fig. 8 is a block diagram of the 2D skeleton line obtaining module 202 in the 3D skeleton line construction device 200 shown in Fig. 7. The 2D skeleton line obtaining module 202 includes a joint point detection unit 2021 and a joint point classification unit 2022.

The joint point detection unit 2021 is configured to input the original image into the pre-trained convolutional neural network and to detect multiple joint points through the convolutional neural network.

In embodiments of the present invention, the joint point detection unit 2021 may be used to execute sub-step S1021.

The joint point classification unit 2022 is configured to classify the multiple joint points using the convolutional neural network to obtain the 2D skeleton line.

In embodiments of the present invention, the joint point classification unit 2022 may be used to execute sub-step S1022.

The image correction module 203 is configured to correct the original image using calibration data so as to remove the distortion of the original image.

In embodiments of the present invention, the image correction module 203 may be used to execute step S103.

The stereo matching module 204 is configured to perform binocular stereo matching on the corrected first image and second image to obtain the depth map.

In embodiments of the present invention, the stereo matching module 204 may be used to execute step S104.

Referring to Fig. 9, Fig. 9 is a block diagram of the stereo matching module 204 in the 3D skeleton line construction device 200 shown in Fig. 7. The stereo matching module 204 includes a feature description unit 2041, a binocular disparity map obtaining unit 2042 and a depth map obtaining unit 2043.

The feature description unit 2041 is configured to perform feature description on the corrected first image and second image respectively to obtain a first feature map and a second feature map.

In embodiments of the present invention, the feature description unit 2041 may be used to execute sub-step S1041.

Referring to Figure 10, Figure 10 is a block diagram of the feature description unit 2041 in the stereo matching module 204 shown in Fig. 9. The feature description unit 2041 includes a first binary string construction unit 20411, a first feature map obtaining unit 20412, a second binary string construction unit 20413 and a second feature map obtaining unit 20414.

The first binary string construction unit 20411 is configured to construct a first binary string for each pixel, using the magnitude relationship between each pixel in the corrected first image and its neighborhood pixels.

In embodiments of the present invention, the first binary string construction unit 20411 may be used to execute sub-step S10411.

The first feature map obtaining unit 20412 is configured to replace the pixel value of each pixel with its corresponding first binary string to obtain the first feature map.

In embodiments of the present invention, the first feature map obtaining unit 20412 may be used to execute sub-step S10412.

The second binary string construction unit 20413 is configured to construct a second binary string for each pixel, using the magnitude relationship between each pixel in the corrected second image and its neighborhood pixels.

In embodiments of the present invention, the second binary string construction unit 20413 may be used to execute sub-step S10413.

The second feature map obtaining unit 20414 is configured to replace the pixel value of each pixel with its corresponding second binary string to obtain the second feature map.

In embodiments of the present invention, the second feature map obtaining unit 20414 may be used to execute sub-step S10414.
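
The binary string construction described above resembles what is commonly called a census transform. The following is a minimal sketch under that reading; the function name `census_transform`, the square window of radius `radius`, the edge-replicating padding, and the bit convention (1 when the neighbor is smaller than the center) are assumptions, not details taken from the embodiment.

```python
import numpy as np

def census_transform(image, radius=1):
    """Replace each pixel by a bit string of 'neighbor < center' comparisons.

    The bits are packed into a uint64, so radius must be at most 3
    (a 7 x 7 window gives 48 bits).
    """
    image = np.asarray(image, dtype=np.int64)
    h, w = image.shape
    padded = np.pad(image, radius, mode="edge")
    census = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the center pixel is not compared with itself
            neighbor = padded[radius + dy:radius + dy + h,
                              radius + dx:radius + dx + w]
            # Shift the existing bits left and append the new comparison bit.
            census = (census << np.uint64(1)) | (neighbor < image).astype(np.uint64)
    return census
```

Applying the same function to the corrected first image and second image would yield the first feature map and the second feature map respectively.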

The binocular disparity map obtaining unit 2042 is configured to subject the first feature map and the second feature map to cost calculation, cost aggregation, left-right consistency detection and disparity post-processing to obtain the binocular disparity map.

In embodiments of the present invention, the binocular disparity map obtaining unit 2042 may be used to execute sub-step S1042.

Referring to Figure 11, Figure 11 is a block diagram of the binocular disparity map obtaining unit 2042 in the stereo matching module 204 shown in Fig. 9. The binocular disparity map obtaining unit 2042 includes a first disparity map obtaining unit 20421, a second disparity map obtaining unit 20422, a disparity noise elimination unit 20423, a mismatched point elimination unit 20424, an initial disparity map obtaining unit 20425 and a processing optimization unit 20426.

The first disparity map obtaining unit 20421 is configured to calculate the matching cost between each pixel in the first feature map and each pixel within the corresponding disparity search range in the second feature map to obtain a first disparity map, wherein the first disparity map is the result of cost calculation with the first feature map as the reference map and the second feature map as the target map.

In embodiments of the present invention, the first disparity map obtaining unit 20421 may be used to execute sub-step S10421.

The second disparity map obtaining unit 20422 is configured to calculate the matching cost between each pixel in the second feature map and each pixel within the corresponding disparity search range in the first feature map to obtain a second disparity map, wherein the second disparity map is the result of cost calculation with the second feature map as the reference map and the first feature map as the target map.

In embodiments of the present invention, the second disparity map obtaining unit 20422 may be used to execute sub-step S10422.
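
The cost calculation for both disparity maps can be sketched as below. It is assumed here that the feature maps hold binary strings packed into unsigned integers, that the matching cost is the Hamming distance between them, and that the disparity is chosen by winner-takes-all; the names `popcount`, `match_disparity`, `max_disparity` and `direction` are introduced only for this illustration.

```python
import numpy as np

def popcount(values):
    """Count the set bits of each element of a uint64 array."""
    values = np.asarray(values, dtype=np.uint64).copy()
    counts = np.zeros(values.shape, dtype=np.int64)
    while values.any():
        counts += (values & np.uint64(1)).astype(np.int64)
        values >>= np.uint64(1)
    return counts

def match_disparity(reference, target, max_disparity, direction=-1):
    """Winner-takes-all disparity of `reference` against `target`.

    direction=-1 compares reference column x with target column x - d
    (first/left map as reference); direction=+1 compares with x + d
    (second/right map as reference).
    """
    reference = np.asarray(reference, dtype=np.uint64)
    target = np.asarray(target, dtype=np.uint64)
    h, w = reference.shape
    # 65 exceeds any 64-bit Hamming distance and marks out-of-range candidates.
    cost = np.full((max_disparity + 1, h, w), 65, dtype=np.int64)
    for d in range(min(max_disparity + 1, w)):
        if direction < 0:
            cost[d, :, d:] = popcount(reference[:, d:] ^ target[:, :w - d])
        else:
            cost[d, :, :w - d] = popcount(reference[:, :w - d] ^ target[:, d:])
    return np.argmin(cost, axis=0)
```

Calling `match_disparity(first, second, n)` would correspond to the first disparity map and `match_disparity(second, first, n, direction=+1)` to the second; cost aggregation over the `cost` volume is omitted from this sketch.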

The disparity noise elimination unit 20423 is configured to eliminate the disparity noise in the first disparity map and the second disparity map using cost aggregation.

In embodiments of the present invention, the disparity noise elimination unit 20423 may be used to execute sub-step S10423.

The mismatched point elimination unit 20424 is configured to eliminate, using disparity consistency detection, the mismatched points generated during cost calculation.

In embodiments of the present invention, the mismatched point elimination unit 20424 may be used to execute sub-step S10424.

The initial disparity map obtaining unit 20425 is configured to merge the first disparity map and the second disparity map to obtain an initial disparity map in which the disparity information of the mismatched points is filled.

In embodiments of the present invention, the initial disparity map obtaining unit 20425 may be used to execute sub-step S10425.
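
A minimal sketch of disparity consistency detection followed by filling is given below. The text does not detail how the two disparity maps are merged, so the row-wise fill from the nearest consistent pixel on the left is a stand-in chosen for illustration; the names `left_right_check`, `fill_mismatches` and `tolerance` are likewise assumptions.

```python
import numpy as np

def left_right_check(disp_first, disp_second, tolerance=1):
    """Return True where the first disparity map agrees with the second."""
    disp_first = np.asarray(disp_first, dtype=np.int64)
    disp_second = np.asarray(disp_second, dtype=np.int64)
    h, w = disp_first.shape
    # Column in the second view that each first-view pixel maps to.
    cols = np.arange(w)[None, :] - disp_first
    inside = cols >= 0
    rows = np.arange(h)[:, None].repeat(w, axis=1)
    matched = disp_second[rows, np.clip(cols, 0, w - 1)]
    return inside & (np.abs(disp_first - matched) <= tolerance)

def fill_mismatches(disp_first, valid):
    """Fill inconsistent pixels with the nearest consistent disparity to their left."""
    filled = np.asarray(disp_first, dtype=np.int64).copy()
    for row, ok in zip(filled, valid):
        last = 0  # used until the first consistent pixel of the row is seen
        for x in range(row.shape[0]):
            if ok[x]:
                last = row[x]
            else:
                row[x] = last
    return filled
```

Pixels rejected by the check are treated as mismatched points, and the filled result plays the role of the initial disparity map in this sketch.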

The processing optimization unit 20426 is configured to perform processing optimization on the initial disparity map to obtain the binocular disparity map.

In embodiments of the present invention, the processing optimization unit 20426 may be used to execute sub-step S10426.

The depth map obtaining unit 2043 is configured to perform triangulation on the binocular disparity map to obtain the depth map.

In embodiments of the present invention, the depth map obtaining unit 2043 may be used to execute sub-step S1043.

The execution module 205 is configured to combine the 2D skeleton line with the depth map and render the 3D skeleton line.

In embodiments of the present invention, the execution module 205 may be used to execute step S105.

In conclusion, the present invention provides a 3D skeleton line construction method and device, applied to an electronic device provided with a photographic device, the photographic device including a first camera lens and a second camera lens. The method includes: obtaining the original image acquired by the photographic device, wherein the original image includes a first image acquired by the first camera lens and a second image acquired by the second camera lens; inputting the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line; correcting the original image using calibration data to remove the distortion of the original image; performing binocular stereo matching on the corrected first image and second image to obtain a depth map; and combining the 2D skeleton line with the depth map to render a 3D skeleton line. Compared with the prior art, the present invention has the following advantages. Firstly, the 2D skeleton line is combined with binocular stereo matching to realize a 3D skeleton line application, thereby extending traditional skeleton line applications to 3D. Secondly, because binocular stereo matching places low demands on equipment, it can, within a certain range, replace higher-cost equipment such as kinect for motion-sensing interactive games. Finally, by using binocular matching, the skeleton line can be ported to mobile phones and embedded devices equipped with binocular cameras, which widens the application range of traditional skeleton line algorithms; and because the skeleton line carries 3D information, motion-sensing interactive applications significantly better than those based on 2D skeleton lines can be realized.

In the several embodiments provided herein, it should be understood that the disclosed device and method may also be implemented in other ways. The device embodiments described above are merely exemplary. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and any combination of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

In addition, the functional modules in the embodiments of the present invention may be integrated together to form one independent part, each module may exist separately, or two or more modules may be integrated to form one independent part.

If the functions are implemented in the form of software function modules and sold or used as an independent product, they may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a mobile hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disk. It should be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements inherent to such a process, method, article or device. In the absence of further restrictions, an element limited by the sentence "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.

The foregoing is only a preferred embodiment of the present invention and is not intended to restrict the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention. It should also be noted that similar reference numerals and letters denote similar items in the drawings; therefore, once an item is defined in one drawing, it does not need to be further defined or explained in subsequent drawings.

Claims

1. A 3D skeleton line construction method, characterized in that it is applied to an electronic device provided with a photographic device, the photographic device comprising a first camera lens and a second camera lens, the method comprising:

obtaining the original image acquired by the photographic device, wherein the original image comprises a first image acquired by the first camera lens and a second image acquired by the second camera lens;

inputting the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line;

correcting the original image using calibration data to remove the distortion of the original image;

performing binocular stereo matching on the corrected first image and second image to obtain a depth map;

combining the 2D skeleton line with the depth map to render a 3D skeleton line.

2. The method according to claim 1, characterized in that the step of performing binocular stereo matching on the corrected first image and second image to obtain a depth map comprises:

performing feature description on the corrected first image and second image respectively to obtain a first feature map and a second feature map;

subjecting the first feature map and the second feature map to cost calculation, cost aggregation, left-right consistency detection and disparity post-processing to obtain a binocular disparity map;

performing triangulation on the binocular disparity map to obtain the depth map.

3. The method according to claim 2, characterized in that the step of performing feature description on the corrected first image and second image respectively to obtain a first feature map and a second feature map comprises:

constructing a first binary string for each pixel, using the magnitude relationship between each pixel in the corrected first image and its neighborhood pixels;

replacing the pixel value of each pixel with its corresponding first binary string to obtain the first feature map;

constructing a second binary string for each pixel, using the magnitude relationship between each pixel in the corrected second image and its neighborhood pixels;

replacing the pixel value of each pixel with its corresponding second binary string to obtain the second feature map.

4. The method according to claim 2, characterized in that the step of subjecting the first feature map and the second feature map to cost calculation, cost aggregation, left-right consistency detection and disparity post-processing to obtain a binocular disparity map comprises:

calculating the matching cost between each pixel in the first feature map and each pixel within the corresponding disparity search range in the second feature map to obtain a first disparity map, wherein the first disparity map is the result of cost calculation with the first feature map as the reference map and the second feature map as the target map;

calculating the matching cost between each pixel in the second feature map and each pixel within the corresponding disparity search range in the first feature map to obtain a second disparity map, wherein the second disparity map is the result of cost calculation with the second feature map as the reference map and the first feature map as the target map;

eliminating the disparity noise in the first disparity map and the second disparity map using cost aggregation;

eliminating, using disparity consistency detection, the mismatched points generated during cost calculation;

merging the first disparity map and the second disparity map to obtain an initial disparity map in which the disparity information of the mismatched points is filled;

performing processing optimization on the initial disparity map to obtain the binocular disparity map.

5. The method according to claim 1, characterized in that the step of inputting the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line comprises:

inputting the original image into the pre-trained convolutional neural network, and detecting multiple joint points through the convolutional neural network;

classifying the multiple joint points using the convolutional neural network to obtain the 2D skeleton line.

6. A 3D skeleton line construction device, characterized in that it is applied to an electronic device provided with a photographic device, the photographic device comprising a first camera lens and a second camera lens, the 3D skeleton line construction device comprising:

an original image obtaining module, configured to obtain the original image acquired by the photographic device, wherein the original image comprises a first image acquired by the first camera lens and a second image acquired by the second camera lens;

a 2D skeleton line obtaining module, configured to input the original image into a pre-trained convolutional neural network to obtain a 2D skeleton line;

an image correction module, configured to correct the original image using calibration data to remove the distortion of the original image;

a stereo matching module, configured to perform binocular stereo matching on the corrected first image and second image to obtain a depth map;

an execution module, configured to combine the 2D skeleton line with the depth map to render a 3D skeleton line.

7. The device according to claim 6, characterized in that the stereo matching module further comprises:

a feature description unit, configured to perform feature description on the corrected first image and second image respectively to obtain a first feature map and a second feature map;

a binocular disparity map obtaining unit, configured to subject the first feature map and the second feature map to cost calculation, cost aggregation, left-right consistency detection and disparity post-processing to obtain a binocular disparity map;

a depth map obtaining unit, configured to perform triangulation on the binocular disparity map to obtain the depth map.

8. The device according to claim 7, characterized in that the feature description unit further comprises:

a first binary string construction unit, configured to construct a first binary string for each pixel, using the magnitude relationship between each pixel in the corrected first image and its neighborhood pixels;

a first feature map obtaining unit, configured to replace the pixel value of each pixel with its corresponding first binary string to obtain the first feature map;

a second binary string construction unit, configured to construct a second binary string for each pixel, using the magnitude relationship between each pixel in the corrected second image and its neighborhood pixels;

a second feature map obtaining unit, configured to replace the pixel value of each pixel with its corresponding second binary string to obtain the second feature map.

9. The device according to claim 7, characterized in that the binocular disparity map obtaining unit further comprises:

a first disparity map obtaining unit, configured to calculate the matching cost between each pixel in the first feature map and each pixel within the corresponding disparity search range in the second feature map to obtain a first disparity map, wherein the first disparity map is the result of cost calculation with the first feature map as the reference map and the second feature map as the target map;

a second disparity map obtaining unit, configured to calculate the matching cost between each pixel in the second feature map and each pixel within the corresponding disparity search range in the first feature map to obtain a second disparity map, wherein the second disparity map is the result of cost calculation with the second feature map as the reference map and the first feature map as the target map;

a disparity noise elimination unit, configured to eliminate the disparity noise in the first disparity map and the second disparity map using cost aggregation;

a mismatched point elimination unit, configured to eliminate, using disparity consistency detection, the mismatched points generated during cost calculation;

an initial disparity map obtaining unit, configured to merge the first disparity map and the second disparity map to obtain an initial disparity map in which the disparity information of the mismatched points is filled;

a processing optimization unit, configured to perform processing optimization on the initial disparity map to obtain the binocular disparity map.

10. The device according to claim 6, characterized in that the 2D skeleton line obtaining module comprises:

a joint point detection unit, configured to input the original image into the pre-trained convolutional neural network and to detect multiple joint points through the convolutional neural network;

a joint point classification unit, configured to classify the multiple joint points using the convolutional neural network to obtain the 2D skeleton line.