CN104050712B

CN104050712B - Method and device for establishing a three-dimensional model

Info

Publication number: CN104050712B
Application number: CN201310083981.6A
Authority: CN
Inventors: 佟强; 李亮
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2013-03-15
Filing date: 2013-03-15
Publication date: 2018-06-05
Anticipated expiration: 2033-03-15
Also published as: CN104050712A

Abstract

The invention discloses a method and a device for establishing a three-dimensional model. The method includes: for each received image frame, determining the pose of the target object in the image frame according to the depth information of the target object in that frame; restoring the position of the spatial point cloud corresponding to the target object in the image frame to the position it occupies when the target object is in a reference pose; updating, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in a reference frame; and establishing the three-dimensional model according to the updated reference point cloud. By estimating the pose of the target in each image frame and restoring the position of its spatial point cloud, the invention can effectively estimate the pose of the target and reduce the difficulty of modeling. In addition, the invention allows the density of points in the spatial point cloud used for modeling to increase continuously, which improves modeling accuracy and helps reduce the cost of modeling.

Description

Method and device for establishing a three-dimensional model

Technical field

The present invention relates to the field of image processing and, in particular, to a method and a device for establishing a three-dimensional model.

Background

Traditional three-dimensional model reconstruction systems mainly follow three technical routes, which differ chiefly in the data acquisition equipment used. The first is based on acquired two-dimensional image data and fits the image features on the two-dimensional image with triangular patches or parametric surfaces; the input data this method requires is simple and easy to obtain, but its representational precision and computing speed are limited. The second is based on three-dimensional scanning equipment; it can generate accurate three-dimensional models, but the hardware used to acquire the input data is very expensive and difficult to popularize. The third is based on spatial point clouds; it can obtain input data by a variety of means, and the accuracy of the generated model improves as the amount of input data increases. The present invention adopts this third route.

In head three-dimensional model reconstruction based on spatial point clouds, the difficulty lies in accurately locating the head and estimating its pose angle. Existing face detection techniques can detect frontal faces, or side faces at moderate angles, in a single video frame, but they fail when the head is rotated by a large angle or even shows its back. For a detected frontal face, the pose angle is usually estimated by template matching or geometric estimation; these methods are limited by their template or geometric assumptions, so their precision is hard to guarantee. In addition, the accuracy of existing three-dimensional modeling based on spatial point clouds is directly affected by the precision of the video/image acquisition device: if that precision is low, the points in the acquired spatial point cloud are sparsely distributed, so that the final three-dimensional model deviates considerably from the shape of the actual target.

No effective solution has yet been proposed for the problems that three-dimensional modeling schemes based on spatial point clouds in the related art have difficulty estimating object pose and have poor precision.

Summary of the invention

In view of the problems that three-dimensional modeling schemes based on spatial point clouds in the related art have difficulty estimating object pose and have poor precision, the present invention proposes a method and a device for establishing a three-dimensional model, which can effectively estimate the pose of a target object and reduce the influence of the performance of the image acquisition device on modeling accuracy.

The technical solution of the present invention is realized as follows:

According to one aspect of the present invention, a method for establishing a three-dimensional model is provided.

The method for establishing a three-dimensional model according to the present invention includes:

for each received image frame, determining the pose of the target object in the image frame according to the depth information of the target object in that frame;

restoring the position of the spatial point cloud corresponding to the target object in the image frame to the position it occupies when the target object is in a reference pose;

updating, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in a reference frame;

establishing the three-dimensional model according to the updated reference point cloud.

According to another aspect of the present invention, a device for establishing a three-dimensional model is provided.

The device for establishing a three-dimensional model according to the present invention includes:

a determining module, configured to determine, for each received image frame, the pose of the target object in the image frame according to the depth information of the target object in that frame;

a restoring module, configured to restore the position of the spatial point cloud corresponding to the target object in the image frame to the position it occupies when the target object is in a reference pose;

an update module, configured to update, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in a reference frame;

an establishing module, configured to establish the three-dimensional model according to the updated reference point cloud.

By estimating the pose of the target in each image frame and restoring the position of its spatial point cloud, the present invention can effectively estimate the pose of the target and reduce the difficulty of modeling. In addition, by updating the spatial point cloud in the reference frame with the spatial point clouds in subsequent image frames, the present invention continuously increases the density of points in the spatial point cloud used for modeling. Even if the imaging device performs poorly, a denser point cloud can be obtained by combining the point clouds of multiple image frames, which improves modeling accuracy and helps reduce the cost of modeling.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of the method for establishing a three-dimensional model according to an embodiment of the present invention;

Fig. 2 is a flow chart of a specific implementation of the method for establishing a three-dimensional model according to an embodiment of the present invention;

Fig. 3 is a block diagram of the device for establishing a three-dimensional model according to an embodiment of the present invention;

Fig. 4 is an exemplary block diagram of a computer that implements the technical solution of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention fall within the scope of protection of the present invention.

According to an embodiment of the present invention, a method for establishing a three-dimensional model is provided.

As shown in Fig. 1, the method for establishing a three-dimensional model according to an embodiment of the present invention includes:

Step S101: for each received image frame, determine the pose of the target object in the image frame according to the depth information of the target object in that frame. Here, the pose may include the rotation angle and/or the position of the target object relative to the image acquisition device (for example, the shooting position). For example, when the target object is a face, the pose may include the angle of the face and/or the position of the face (the position here may include the vertical position and/or the horizontal position of the face).

Step S103: restore the position of the spatial point cloud corresponding to the target object in the image frame to the position it occupies when the target object is in the reference pose (the reference pose being the pose in which the target object is at a reference position and its rotation angle relative to the shooting point equals a reference angle). Specifically, because the sampling interval between image frames is short, in any single received image frame (which contains depth information) the region where the target object is located essentially always overlaps the point cloud region of the reference frame (the current three-dimensional model). That is, some part of the target object is captured both in the reference frame and in the current image frame, and the points corresponding to that part in the current image also exist in the reference frame; their positions differ from those of the corresponding points in the reference frame only because the target object in the current frame has undergone a pose change (rotation and/or displacement) relative to the target object in the reference frame. From the depth information of the target object region in the current image frame, the rotation angle and translation distance of the target object in that frame relative to the reference position can therefore be determined. The point cloud of the target object region in the current frame can then be rotated and translated, so that the many discrete spatial points corresponding to the target object in the current image frame are transformed to the spatial positions they would occupy if the target object were in the reference pose of the reference frame.

Step S105: update, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in the reference frame.

Step S107: establish the three-dimensional model according to the updated reference point cloud (establishing the three-dimensional model includes both building a new three-dimensional model and updating the existing three-dimensional model with the currently updated reference point cloud).
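The restoration and update of steps S103 and S105 can be illustrated with a minimal sketch. It assumes the pose has already been estimated as a rotation matrix R and a translation vector t under the convention p_current = R p_reference + t, and it assumes the update is a plain concatenation of the two clouds. The function names restore_to_reference and update_reference are hypothetical and do not come from the patent.

```python
import numpy as np

def restore_to_reference(points, R, t):
    """Undo the pose (R, t) of the current frame.

    Assumed convention: p_current = R @ p_reference + t, so
    p_reference = R.T @ (p_current - t). Points are stored as rows.
    """
    points = np.asarray(points, dtype=float)
    # Row-vector form of R.T @ (p - t).
    return (points - np.asarray(t, dtype=float)) @ np.asarray(R, dtype=float)

def update_reference(reference_cloud, restored_cloud):
    """Combine the restored points with the reference point cloud (assumed: concatenation)."""
    return np.vstack([reference_cloud, restored_cloud])
```

Because every frame contributes its restored points, the reference cloud grows denser with each frame even when a single frame is sparse.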

With the above technical solution of the present invention, the spatial point cloud of the reference frame can be updated continuously using the spatial point clouds in subsequently received image frames. Even if the video acquisition device performs poorly and can only capture sparse point clouds, continually acquiring images and updating the reference point cloud makes the initially sparse points denser. This effectively improves modeling accuracy, solves the problem in many cases of modeling accuracy declining because of insufficient device precision, and avoids the need for expensive image acquisition devices. Moreover, by restoring the position of the point cloud, the present invention overcomes the difficulty of setting up the image acquisition device: even if the device cannot change position to follow the pose of the target object, the point cloud can still be restored to the reference pose and used to update the reference frame at the reference pose, which reduces the difficulty of modeling.

Furthermore, for each received image frame, before the position of the spatial point cloud corresponding to the target object in the frame is restored, it may first be judged whether the size of the target object in the image frame is the same as its size in the reference frame. If the sizes differ, the target object is scaled according to the size difference between the target object in the image frame and the target object in the reference frame.

Performing the size judgment before restoring the point cloud position effectively avoids discrepancies between the point cloud of the current image frame and that of the original image frame caused by the target object moving away from or toward the image acquisition device. Restoring the point cloud after correcting the size difference further improves the accuracy of the updated point cloud and hence the accuracy of modeling.
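The size check can be sketched as follows. The patent does not say how size is measured; this example assumes the diagonal of the axis-aligned bounding box as the size measure and scales about the centroid. The names bbox_diagonal and rescale_to_reference are hypothetical.

```python
import numpy as np

def bbox_diagonal(points):
    """Size measure (an assumption): diagonal length of the axis-aligned bounding box."""
    points = np.asarray(points, dtype=float)
    return float(np.linalg.norm(points.max(axis=0) - points.min(axis=0)))

def rescale_to_reference(points, reference_points, tol=1e-6):
    """Scale the current cloud about its centroid so its size matches the reference size."""
    points = np.asarray(points, dtype=float)
    size = bbox_diagonal(points)
    ref_size = bbox_diagonal(reference_points)
    if size == 0.0 or abs(size - ref_size) <= tol * ref_size:
        return points.copy()          # sizes already agree (or the cloud is degenerate)
    s = ref_size / size               # scale factor derived from the size difference
    centroid = points.mean(axis=0)
    return centroid + s * (points - centroid)
```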

In addition, the method may further include:

presetting a confidence for each point in the reference point cloud;

for each point in the reference point cloud, adjusting the value of the point's confidence according to the position difference between that point and the corresponding restored point in the point cloud of the target object in a subsequently received image frame.

Furthermore, when the three-dimensional model is established according to the updated reference point cloud, it is established according to the confidence of the points in the updated reference point cloud.

Optionally, when the reference point cloud of the target object in the reference pose in the reference frame is updated according to the restored spatial point cloud, points whose confidence value is below a predetermined value may be deleted.

In addition, for each received image frame, when the reference point cloud of the target object in the reference pose in the reference frame is updated according to the restored spatial point cloud, the spatial point cloud of the target object in the image frame may be combined with the reference point cloud of the target object to obtain the updated reference point cloud. Moreover, each time the reference point cloud has been updated, the three-dimensional model corresponding to it may be updated as well.

The adjustment of confidence and the process of updating the point cloud are described below by way of example.

For example, suppose the confidence value of some point P in the reference frame is 5 (this value may be a preset initial value, or the result of earlier adjustments based on several image frames). After a first frame A is acquired, the spatial point cloud of frame A can be combined with the spatial point cloud of the reference frame to obtain a new reference point cloud. The updated reference point cloud then includes not only the point P from the reference frame but also the corresponding point P_A from frame A. If the distance between P_A and P exceeds a predetermined distance, the target object is considered to have changed appreciably, and the confidence value of P in the reference frame can be reduced, for example to 4. After a second frame B is acquired, its spatial point cloud can be combined with the previously updated reference point cloud, so that the combined point cloud includes P, P_A, and the corresponding point P_B from frame B. If the distance between P_B and P still exceeds the predetermined distance while the distance between P_B and P_A is less than the predetermined distance, the confidence value of P can be reduced further (for example, from 4 to 3), and the confidence value of P_A from frame A can be increased.

By analogy, once the confidence of P falls below a predetermined value, P can be deleted from the reference point cloud when the spatial point cloud of the reference frame (the reference point cloud) is updated. Alternatively, points whose confidence is below the predetermined value need not be deleted; they are simply ignored when the model is established.
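The example with P, P_A and P_B can be reproduced with a small sketch. It makes several assumptions that the patent leaves open: the "corresponding point" is taken to be the nearest restored point of the new frame, the confidence step is 1, and new points start at confidence 5. The names update_confidences and prune are hypothetical.

```python
import numpy as np

def update_confidences(points, conf, observed, max_dist=0.01, step=1.0, init_conf=5.0):
    """Adjust confidences against one new frame, then add that frame's points.

    points   : (N, 3) current reference point cloud
    conf     : (N,)   confidence of each reference point
    observed : (M, 3) restored points of the new frame, M >= 1
    """
    points = np.asarray(points, dtype=float)
    conf = np.asarray(conf, dtype=float).copy()
    observed = np.asarray(observed, dtype=float)
    # Distance from every reference point to its nearest observed point.
    d = np.linalg.norm(points[:, None, :] - observed[None, :, :], axis=2).min(axis=1)
    # Far from the new observation: less trusted; close to it: more trusted.
    conf += np.where(d > max_dist, -step, step)
    new_points = np.vstack([points, observed])
    new_conf = np.concatenate([conf, np.full(len(observed), init_conf)])
    return new_points, new_conf

def prune(points, conf, min_conf):
    """Delete points whose confidence is below the predetermined value."""
    keep = conf >= min_conf
    return points[keep], conf[keep]
```

A real system would also need to decide what happens to reference points that are simply not visible in the new frame (for example, the back of the head); this sketch does not handle that case.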

With this scheme, suppose the pose or shape of the target object briefly changes from a normal state to a transient state and then returns to the normal state. Points of the target object's spatial point cloud acquired in the normal state lie close to one another, while they differ considerably in distance from the points acquired in the transient state. As frames accumulate, the points whose positions changed during the transient state are deleted, and modeling still relies on the points from the normal state. Similarly, if the target object goes from a transient state to the normal state, taking the confidence of each point into account means that, as the number of frames grows, the points corresponding to the transient state are deleted and modeling is based on the target object in its normal state.

For example, when a person's expression and/or pose changes briefly, the above scheme effectively prevents the change from reducing modeling accuracy.

In other embodiments, the confidence may take other values, for example non-integer values less than 1, and the step by which the value is adjusted may be modified accordingly; the invention is not limited to the specific example above.

In addition, the reference frame may be updated immediately each time an image frame is obtained, that is, updated online.

In addition, the related art includes methods that keep the head fixed at one position and move the depth data acquisition device around it to obtain the coordinates of each head point in the world coordinate system. Such methods require the acquisition device to move around the head, which makes it difficult to install the device at a fixed position, and they require the head to remain unchanged during data acquisition, which is also inconvenient.

To address this problem, the present invention proposes the following: for the first received image frame, detect the target object and determine its position in the first image frame; for each subsequently received image frame, use a region tracking technique to track, in the current image frame, the target object from the previous frame, thereby determining the position of the target object in the current frame. When this tracking fails (that is, when the position of the target object cannot be determined by tracking in some subsequent image frame), target detection is applied again to detect the target object in the image frame and determine its position. In another optional embodiment of the present invention, target detection may instead be performed on every image frame to determine the position of the target object.
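The detect, track, and re-detect strategy described above can be sketched as a small control loop. The detector and tracker are passed in as callables because the patent does not fix a particular implementation; locate_target, detect, and track are hypothetical names, and a tracker is assumed to return None when it fails.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), an assumed representation

def locate_target(
    frames: Iterable,
    detect: Callable[[object], Optional[Box]],
    track: Callable[[object, Box], Optional[Box]],
) -> List[Optional[Box]]:
    """Detect on the first frame, track afterwards, and re-detect when tracking fails."""
    boxes: List[Optional[Box]] = []
    prev: Optional[Box] = None
    for frame in frames:
        # Try tracking only if the previous frame gave a position.
        box = track(frame, prev) if prev is not None else None
        if box is None:
            box = detect(frame)       # first frame, or tracking failure
        boxes.append(box)
        prev = box
    return boxes
```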

Various techniques may be used to determine the position of the target object during tracking. For example, Kalman filtering and the Continuously Adaptive Mean-Shift algorithm (CamShift) may be used to track the target object, either separately or in combination. The present invention may equally use other tracking techniques to solve the object tracking problem; other techniques usable for object tracking are not enumerated here.

In this way, there is no need to set up the image acquisition device in a complicated, hard-to-realize manner (for example, having the device move around the target object). The image acquisition device only needs to be installed at a fixed position to capture the moving target object and model it, which reduces the difficulty of setting up the equipment and facilitates image acquisition and modeling.

Optionally, each image frame is captured by a video acquisition device, which includes a depth video acquisition device and/or a color video acquisition device. In practice, image frames may be captured with one or more depth video acquisition devices, with multiple color video acquisition devices, or with a combination of color and depth video acquisition devices, as long as the depth information of the target object can be obtained.

In addition, for face modeling scenarios, in a preferred embodiment the target object may be a person's head or face, and the reference pose may be the pose in which the target object faces the front. In other embodiments, the target object may be another solid entity (for example, a teapot or an automobile) or, for instance, a whole human body, and the reference pose may be the pose in which the features of the target object are most apparent (that is, the pose for which object detection is most precise), such as the frontal whole body. In fact, relative to a given image acquisition position (shooting point), if the image captured when the target object is in some pose can be distinguished from the images captured when the target object is in other poses (for example, because the positions of its feature parts differ between poses), then that pose can serve as the reference pose.

In face modeling, the frontal face is usually modeled so that the face can then be recognized; the frontal face is therefore usually taken as the reference pose. In fact, an image of a person's side face can likewise be distinguished from images of the face in other poses, so the solution of the present invention can equally take the person's side face as the reference pose for face modeling.

Similarly, for other solid objects, for example when modeling an automobile in three dimensions, the pose in which the front of the automobile faces the image acquisition device can serve as the reference pose, so that the front of the automobile is modeled; alternatively, the pose in which the side of the automobile faces the image acquisition device, or in which the automobile faces the device at some other rotation angle, can serve as the reference pose.

In addition, the technical solution of the present invention can model any part of an object. As long as the part to be modeled is selected in an image frame as the target object, the technical solution can track that part, determine its position in each image frame, and update the reference frame for that part, so that the spatial point cloud of the part becomes denser and the modeling accuracy is effectively improved.

For example, if the user selects the shoulder of a person appearing in an image frame as the target object, that image frame can serve as the reference frame, and the pose of the shoulder in the reference frame (the angle of the shoulder) can serve as the reference pose. In each subsequently acquired image frame, the shoulder is then tracked and its position determined.

In addition, the technical solution of the present invention can not only model the target object in the reference pose but also establish a 360-degree model of the target object. Taking face modeling as an example, the solution can capture the frontal face and model it; if the reference pose is the person's side face, it can capture the side face and model that; similarly, it can model the back of the head or other angles. By acquiring point clouds in the various poses, a 360-degree model of the target object can be established.

In addition, even if the pose in which the frontal face faces the shooting point is prescribed in advance as the reference pose, the face may rotate while image frames are being captured. When the person's side face is captured, some points on the side of the head and some points on the back also appear in the image frame. The technical solution of the present invention can rotate the point cloud of the person's head in that image frame (and translate it, if the position of the face has changed), that is, transform these points to the positions they occupy in the reference pose, and then combine them with the spatial point cloud of the face in the reference pose in the reference frame (see step S105), thereby updating the spatial point cloud of the face in the reference frame. During this operation, the points on the side and back of the head in the current image frame are rotated and translated along with the face points, so that after the spatial point cloud of the reference frame has been updated with that of the current frame, the reference frame includes not only points on the front of the face but also points on the side and back of the head. As the target object keeps moving and rotating, points on every part of the head are collected and the entire head can be modeled. As the number of image frames increases, the point cloud in the reference frame is updated continuously, the points on the other parts of the head become as dense as those on the face, and the head model becomes more accurate as the online update proceeds.

Similarly, 360-degree spatial three-dimensional modeling of other solid entities can be carried out with the technical solution of the present invention.

The technical solution of the present invention is described below, taking the modeling of a person's head as an example.

To realize the technical solution of the present invention, it is necessary to:

receive an image sequence containing a human head;

track the human head to obtain depth information for the front, side, and back of the head;

calculate the spatial positions of the head points based on the depth information of the head and establish a head three-dimensional model.

The head region may be located by obtaining the initial position of the frontal face region, which determines where the face is.

Furthermore, based on the obtained image frames, the pose direction of the head (front, side, or back) can be estimated from the spatial positions of the head points and the current head three-dimensional model; according to the pose direction of the head, the corresponding region of the head three-dimensional model is recovered.

Furthermore, during modeling, the head model can be updated online by tracking the human head in the video sequence.

Fig. 2 shows the flow of modeling a person's head.

As shown in Fig. 2, the flow specifically includes the following steps:

S201: acquire camera information and align the frame images.

Obtain the intrinsic parameters of the color camera and the depth camera in the acquisition system, align the depth image to the color image, and denote the focal length of the depth camera by f; the aligned color image and depth image are denoted I and D, respectively. Note that this step is unnecessary if only a color camera or only a depth camera is used.

S202: detect the initial position of the face region.

According to the acquired color image and/or depth image, the initial position of the face region is detected and used as the initialization data for reconstructing the head three-dimensional model. This detection may be based on the color image only, on the depth image only, or on both the color image and the depth image.

Given the color image I and the depth image D, face regions FACE_i are detected, where i is the index of a detected face. The largest face region is selected and denoted F; its region in the color image is I_F and its region in the depth image is D_F.
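Selecting the largest detected face region can be sketched in a few lines. The representation of a region as an (x, y, width, height) tuple and the name largest_region are assumptions made for illustration; "largest" is interpreted here as largest area.

```python
from typing import Optional, Sequence, Tuple

Region = Tuple[int, int, int, int]  # (x, y, width, height), an assumed representation

def largest_region(regions: Sequence[Region]) -> Optional[Region]:
    """Return the detected region FACE_i with the largest area, or None if nothing was detected."""
    return max(regions, key=lambda r: r[2] * r[3], default=None)
```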

S203: track and detect the face region.

The detected face region is tracked in the video stream using face detection and target tracking methods; this detection and tracking may be based on the color image only, on the depth image only, or on both the color image and the depth image. The regions of the tracked face region in the color image and the depth image are denoted F.

S204: calculate the spatial positions of the head points.

According to the depth value of each point on the depth image, the spatial position coordinates of each image pixel are calculated by perspective transformation, with the coordinate origin at the center of the depth image. Let the image coordinates of a point P on the depth image be (X, Y) and its depth value be D; the position coordinates (x, y, z) of its spatial position p are then calculated as follows:

x = (X - w/2) * D / f;

y = (h/2 - Y) * D / f;

z = D;

where w is the width and h is the height of the image captured by the camera.
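The three formulas above can be applied to a whole depth image at once. The following is a minimal NumPy sketch; the function name backproject is hypothetical, and treating a depth value of 0 as "no measurement" is an assumption that the patent does not state.

```python
import numpy as np

def backproject(depth, f):
    """Convert a depth image into an (N, 3) array of spatial points.

    Implements x = (X - w/2) * D / f, y = (h/2 - Y) * D / f, z = D,
    where (X, Y) are pixel coordinates and f is the focal length in pixels.
    """
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    Y, X = np.mgrid[0:h, 0:w]             # pixel row (Y) and column (X) indices
    x = (X - w / 2) * depth / f
    y = (h / 2 - Y) * depth / f
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]       # assumed: depth 0 marks a missing measurement
```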

Because the precision of the depth camera itself is limited, each point that has a measured depth value and a calculated spatial position also stores a confidence conf, which indicates how reliably the point is located at that spatial position. In this embodiment, conf may be set to 0.1.

S205: estimate the pose direction of the head in three-dimensional space.

According to the spatial positions of the head points in the current video frame, an optimal transformation matrix is estimated with the iterative closest point method. The transformation matrix contains the rotation angles and translation distances along the X, Y, and Z coordinate axes, such that the spatial positions of the head points in the current video frame, after transformation, best fit the head model recovered so far.

The iterative closest point method cannot estimate the scaling of the model. In this embodiment, the scaling s of the head is therefore estimated first (for example, with an Active Shape Model), together with the rotation angles and translation distances along the three coordinate axes, which serve as the initial values for the iteration. The spatial positions of the head points in the video frame are scaled by s, and the iterative closest point method is then used to estimate the transformation matrix T.
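As an illustration of scaling followed by iterative closest point alignment, the sketch below uses a textbook point-to-point variant: brute-force nearest neighbours and an SVD-based (Kabsch) rigid fit. This is an assumption about the flavour of the algorithm, not the patent's exact procedure; the scale s is taken as given, the iteration starts from the identity rather than from an estimated initial pose, and the names best_rigid_transform and icp are hypothetical.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t such that dst[i] is approximately R @ src[i] + t (Kabsch method)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)          # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T      # guard against reflections
    t = c_dst - R @ c_src
    return R, t

def icp(src, model, s=1.0, iters=20):
    """Align s * src to model; returns the accumulated rotation R and translation t."""
    cur = s * np.asarray(src, dtype=float)       # apply the separately estimated scale first
    model = np.asarray(model, dtype=float)
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        # Brute-force nearest model point for every current point.
        d2 = ((cur[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
        nearest = model[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, nearest)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

With the returned values, a frame point p maps to R @ (s * p) + t, which corresponds to pT = T * (s * p) when T is written as a rotation plus a translation. The brute-force neighbour search is quadratic in the number of points; a spatial index would be the usual replacement for larger clouds.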

S206, reconstructing three-dimensional model.

In the present embodiment, a threedimensional model is represented using octree structure.According to the head estimated in three-dimensional Posture direction in space carries out conversion T, so as to obtain its correspondence to the spatial position p of each head point in current video frame Conversion after spatial position pT, the confidence level of the spatial point is identical with the confidence level of p, wherein, pT=T* (s*p).

The three-dimensional head model stores only the outer surface of the head, and each video frame can capture an image of at most half of the head. A transformed spatial point pT (xT, yT, zT) corresponds to a point p0 (x0, y0, z0) on the model, where x0 = xT and y0 = yT, and is used to correct the spatial coordinates of p0. In the whole three-dimensional head model, there are two points with the same x and y coordinates (p0f on the front of the head and p0b on the back of the head), which differ in their z coordinates. In this embodiment, the three-dimensional head model is constructed with the center point of the whole head as the coordinate origin, i.e. z = 0, so the z coordinates of p0f and p0b have opposite signs. According to the sign of zT, the corresponding p0f or p0b is selected as the point p0 corresponding to pT.

From the point pT (xT, yT, zT) with its confidence confT, and the corresponding point p0 (x0, y0, z0) on the model with its confidence conf, the updated spatial position pM (xM, yM, zM) is calculated as follows:

xM = x0;

yM = y0;

zM = (zT * confT + z0 * conf) / (confT + conf);
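For illustration, the selection of p0 and the confidence-weighted update of its z coordinate can be sketched as below. The function names are hypothetical. Because the text does not state which of p0f and p0b has the positive z coordinate, the sketch simply picks the candidate whose z has the same sign as zT. The confidence update is left out.

```python
def select_model_point(zT, p0f, p0b):
    """Pick the front or back model point whose z has the same sign as zT."""
    return p0f if (zT >= 0) == (p0f[2] >= 0) else p0b

def fuse_point(pT, confT, p0, conf):
    """Return pM: x and y are kept from p0, z is the confidence-weighted mean."""
    x0, y0, z0 = p0
    zT = pT[2]
    zM = (zT * confT + z0 * conf) / (confT + conf)
    return (x0, y0, zM)
```

With equal confidences the update reduces to the plain average of zT and z0; a model point with a higher confidence moves less toward the new observation.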

The new confidence is calculated as follows:

After the above process is started, the user does not need to stay fixed in front of the device or move the head according to prescribed actions. The user only needs to move freely, and the system automatically tracks the user and reconstructs the user's three-dimensional head model online.

The longer the user appears in front of the device, the more accurate the resulting three-dimensional model becomes.

If the shape of the user's head changes, for example because of a new hairstyle, the system can automatically correct the three-dimensional model online so that it conforms to the current shape of the user's head.

In conclusion for problem present in correlation technique, the present invention provides a kind of based on spatial depth information Head threedimensional model dynamic reconstruction scheme.This method and system detects frontal faces in input video first, then automatically The head zone is tracked in three dimensions, calculates the spatial position of each head point, and estimates the attitude angle on head, further according to The attitude angle recovers the threedimensional model of head corresponding region.The present invention is in the premise without relying on expensive data acquisition equipment Under, the reconstruction precision of head threedimensional model is effectively improved, and collecting device can be mounted on fixed position, convenient for data Acquisition, and the dynamic reconstruction method provided can support the online updating of head threedimensional model；Also, the technology of the present invention Scheme can be applicable in for any object that its barment tag can be shown under one or more postures, as long as in target pair As that in rotation or motion process, the posture of object in current image frame can be restored to mesh according to the position where these features Mark object is in position during reference attitude；In addition, whether for face or other target objects, skill of the invention Art scheme can carry out 360 degree of spatial modelings.

Furthermore, the technical solution provided by the present invention can accurately establish a three-dimensional head model even when the head pose, expression, or contour changes. It achieves online updating of the three-dimensional model by tracking the head region in space and, combined with head pose estimation, enhances robustness to changes of the head, thereby supporting changes in pose, expression, and contour. Similarly, three-dimensional modeling can be performed for other target objects with a similar effect. When other target objects are modeled, even if details of their appearance change, the technical solution of the present invention can likewise accommodate these changes and ensure the accuracy of modeling.

According to an embodiment of the present invention, a device for establishing a three-dimensional model is also provided.

As shown in Fig. 3, the device for establishing a three-dimensional model according to an embodiment of the present invention includes:

a determining module 31, configured to determine, for each received image frame, the pose of the target object in the image frame according to the depth information of the target object in the image frame;

a recovery module 32, configured to restore the position of the spatial point cloud corresponding to the target object in the image frame to the position at which the target object is in the reference pose;

an update module 33, configured to update, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in the reference frame;

an establishing module 34, configured to establish a three-dimensional model according to the updated reference point cloud.

The device according to the present invention may further include:

a judgment module (not shown), configured to judge, for each received image frame and before the position of the spatial point cloud corresponding to the target object in the image frame is restored, whether the size of the target object in the image frame is the same as the size of the target object in the reference frame;

a scaling module (not shown), configured to scale the size of the target object according to the difference between the size of the target object in the image frame and the size of the target object in the reference frame, when the judgment module determines that the sizes differ.

In addition, to prevent a transient state of the target object from adversely affecting the modeling result, the device may further include:

a setting module (not shown), configured to preset a confidence for each point in the reference point cloud;

an adjusting module (not shown), configured to adjust, for each point in the reference point cloud, the value of the confidence of the point according to the position difference between the point and the corresponding restored point in the point cloud of the target object in a subsequently received image frame.

When the three-dimensional model is established according to the updated reference point cloud, it may be established according to the confidence of the points in the updated reference point cloud.

In addition, the update module 33 is further configured to delete points whose confidence value is lower than a predetermined value.

In addition, the update module 33 may be configured to combine the spatial point cloud of the target object in the image frame with the reference point cloud of the target object to obtain the updated reference point cloud.
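The two operations of the update module described above can be sketched as follows, assuming that point clouds are stored as (N, 3) NumPy arrays with a parallel array of confidence values. The function name `merge_and_prune`, the array layout, and the order of the two operations are assumptions made for illustration only.

```python
import numpy as np

def merge_and_prune(ref_pts, ref_conf, new_pts, new_conf, min_conf):
    """Combine the restored frame cloud with the reference cloud, then
    delete every point whose confidence is below min_conf."""
    points = np.vstack([ref_pts, new_pts])
    conf = np.concatenate([ref_conf, new_conf])
    keep = conf >= min_conf
    return points[keep], conf[keep]
```

Simple concatenation makes the reference cloud denser with every frame; pruning by confidence keeps short-lived, unreliable points from accumulating.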

In addition, the determining module 31 is further configured to determine, for each received image frame, the position of the target object in the image frame.

In addition, the device of the present invention further includes a detection module (not shown) and a tracking module (not shown). The detection module is configured to detect the target object in the first image frame when the first image frame is received and to determine the position of the target object in the first image frame. The tracking module is configured to track the target object in each subsequent image frame received after the first image frame and to determine the position of the target object in each subsequent image frame, which avoids having to set up the image acquisition device in a more complicated manner. For a subsequent image frame in which the position of the target object cannot be determined by tracking, the detection module is further configured to detect the target object again in that subsequent image frame and determine the position of the target object.
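The cooperation between the detection module and the tracking module can be sketched as a simple control loop. The callables `detect` and `track` are hypothetical placeholders for whatever detector and tracker are used, and each is assumed to return None when it cannot locate the target object.

```python
def locate_targets(frames, detect, track):
    """Yield the target position for each frame.

    detect(frame) -> position or None
    track(frame, previous_position) -> position or None
    Both callables are hypothetical placeholders.
    """
    position = None
    for frame in frames:
        if position is not None:
            # Subsequent frames: try to follow the previous position.
            position = track(frame, position)
        if position is None:
            # First frame, or tracking failed: detect the target again.
            position = detect(frame)
        yield position
```

The variant without a tracking module corresponds to passing a `track` function that always returns None, so that detection runs on every frame.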

In another embodiment, the device of the present invention may include the detection module without the tracking module, in which case the detection module detects the target object in every image frame and determines the position of the target object in each frame.

In addition, each image frame is captured by a video acquisition device, which includes a depth video acquisition device and/or a color video acquisition device.

Optionally, the target object includes a person's head or face. In addition, the reference pose of the target object is the pose in which the target object is frontal.

In conclusion by means of technical scheme, by being estimated the targeted attitude in picture frame and right Spatial point cloud position is reduced, and can effectively estimate the posture of target, reduces the difficulty of modeling；In addition, the present invention passes through profit The spatial point cloud in reference frame is updated with the spatial point cloud in subsequent image frames, enables to the spatial point for modeling The density at cloud midpoint is continuously increased, that is, be used in it is not high using the equipment performance of image, also can be by multiple images frame point The combination of cloud obtains the point cloud of more crypto set, and then improves the accuracy of modeling, and can aid in the cost for reducing modeling； In addition, by means of continuous renewal of the present invention to spatial point cloud, it is multiple in the range of 360 degree that target object can be collected Point so as to help to realize the 360 of target object degree of spatial modelings, and constantly updates spatial point with the rotation of target object Cloud, the point at each position of target object can all become more crypto set, and then improve the accuracy of modeling.

The basic principle of the present invention has been described above in connection with specific embodiments. It should be noted, however, that a person of ordinary skill in the art will understand that all or any of the steps or components of the method and device of the present invention can be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including a processor, a storage medium, and the like) or in a network of computing devices. A person of ordinary skill in the art can achieve this using basic programming skills after reading the description of the present invention.

Therefore, the object of the present invention can also be achieved by running a program or a set of programs on any computing device. The computing device may be a well-known general-purpose device. Accordingly, the object of the present invention can also be achieved merely by providing a program product containing program code that implements the method or device. That is, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. Obviously, the storage medium may be any well-known storage medium or any storage medium developed in the future.

When the embodiments of the present invention are implemented by software and/or firmware, a program constituting the software is installed from a storage medium or a network onto a computer having a dedicated hardware structure, for example the general-purpose computer 400 shown in Fig. 4. When various programs are installed, the computer is able to perform various functions.

In Fig. 4, a central processing unit (CPU) 401 performs various kinds of processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage section 408 into a random access memory (RAM) 403. The RAM 403 also stores, as needed, data required when the CPU 401 performs the various kinds of processing. The CPU 401, the ROM 402, and the RAM 403 are connected to one another via a bus 404. An input/output interface 405 is also connected to the bus 404.

The following components are connected to the input/output interface 405: an input section 406, including a keyboard, a mouse, and the like; an output section 407, including a display such as a cathode-ray tube (CRT) or a liquid crystal display (LCD), a loudspeaker, and the like; a storage section 408, including a hard disk and the like; and a communication section 409, including a network interface card such as a LAN card, a modem, and the like. The communication section 409 performs communication processing via a network such as the Internet.

A drive 410 is also connected to the input/output interface 405 as needed. A removable medium 411, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 410 as needed, so that a computer program read from it is installed into the storage section 408 as needed.

When the above series of processing is implemented by software, the program constituting the software is installed from a network such as the Internet or from a storage medium such as the removable medium 411.

Those skilled in the art will understand that this storage medium is not limited to the removable medium 411 shown in Fig. 4, which stores the program and is distributed separately from the device in order to provide the program to the user. Examples of the removable medium 411 include a magnetic disk (including a floppy disk (registered trademark)), an optical disc (including a compact disc read-only memory (CD-ROM) and a digital versatile disc (DVD)), a magneto-optical disc (including a MiniDisc (MD) (registered trademark)), and a semiconductor memory. Alternatively, the storage medium may be the ROM 402, a hard disk included in the storage section 408, or the like, in which the program is stored and which is distributed to the user together with the device containing it.

It should also be noted that, in the device and method of the present invention, the components or steps can obviously be decomposed and/or recombined. Such decomposition and/or recombination should be regarded as equivalent solutions of the present invention. Moreover, the steps of the above series of processing can naturally be performed chronologically in the order described, but they need not necessarily be performed chronologically. Some steps may be performed in parallel or independently of one another.

Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made without departing from the spirit and scope of the present invention as defined by the appended claims. Moreover, the terms "include", "comprise", or any other variants thereof in this application are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

Claims

1. A method for establishing a three-dimensional model, characterized by comprising:

for each received image frame, determining the pose of a target object in the image frame according to depth information of the target object in the image frame;

restoring the position of the spatial point cloud corresponding to the target object in the image frame to the position at which the target object is in a reference pose;

updating, according to the restored spatial point cloud, a reference point cloud of the target object in the reference pose in a reference frame;

establishing a three-dimensional model according to the updated reference point cloud;

wherein the method further comprises:

presetting a confidence for each point in the reference point cloud;

for each point in the reference point cloud, adjusting the value of the confidence of the point according to the position difference between the point and the corresponding restored point in the point cloud of the target object in a subsequently received image frame.

2. The method according to claim 1, characterized by further comprising:

for each received image frame, before the position of the spatial point cloud corresponding to the target object in the image frame is restored, judging whether the size of the target object in the image frame is the same as the size of the target object in the reference frame;

when the judgment result is that the sizes differ, scaling the size of the target object according to the difference between the size of the target object in the image frame and the size of the target object in the reference frame.

3. The method according to claim 1, characterized in that, when the three-dimensional model is established according to the updated reference point cloud, the three-dimensional model is established according to the confidence of the points in the updated reference point cloud.

4. The method according to claim 3, characterized in that updating, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in the reference frame further comprises:

deleting points whose confidence value is lower than a predetermined value.

5. The method according to claim 1, characterized in that, for each received image frame, updating, according to the restored spatial point cloud, the reference point cloud of the target object in the reference pose in the reference frame comprises:

combining the spatial point cloud of the target object in the image frame with the reference point cloud of the target object to obtain the updated reference point cloud.

6. The method according to claim 1, characterized by further comprising:

when a first image frame is received, detecting the target object in the first image frame and determining the position of the target object in the first image frame;

for each subsequent image frame received after the first image frame, tracking the target object and determining the position of the target object in each subsequent image frame, wherein, for a subsequent image frame in which the position of the target object cannot be determined by tracking, the target object is detected again in that subsequent image frame.

7. The method according to any one of claims 1 to 6, characterized in that each image frame is captured by a video acquisition device, and the video acquisition device includes a depth video acquisition device and/or a color video acquisition device.

8. The method according to any one of claims 1 to 6, characterized in that the target object includes a person's head or face.

9. The method according to any one of claims 1 to 6, characterized in that the reference pose is the pose in which the target object is frontal.

10. A device for establishing a three-dimensional model, characterized by comprising:

a determining module, configured to determine, for each received image frame, the pose of a target object in the image frame according to depth information of the target object in the image frame;

a recovery module, configured to restore the position of the spatial point cloud corresponding to the target object in the image frame to the position at which the target object is in a reference pose;

an update module, configured to update, according to the restored spatial point cloud, a reference point cloud of the target object in the reference pose in a reference frame;

an establishing module, configured to establish a three-dimensional model according to the updated reference point cloud;

a setting module, configured to preset a confidence for each point in the reference point cloud;

an adjusting module, configured to adjust, for each point in the reference point cloud, the value of the confidence of the point according to the position difference between the point and the corresponding restored point in the point cloud of the target object in a subsequently received image frame.

11. The device according to claim 10, characterized by further comprising:

a judgment module, configured to judge, for each received image frame and before the position of the spatial point cloud corresponding to the target object in the image frame is restored, whether the size of the target object in the image frame is the same as the size of the target object in the reference frame;

a scaling module, configured to scale the size of the target object according to the difference between the size of the target object in the image frame and the size of the target object in the reference frame, when the judgment result of the judgment module is that the sizes differ.

12. The device according to claim 10, characterized in that, when the three-dimensional model is established according to the updated reference point cloud, the establishing module is configured to establish the three-dimensional model according to the confidence of the points in the updated reference point cloud.

13. The device according to claim 12, characterized in that the update module is further configured to delete points whose confidence value is lower than a predetermined value.

14. The device according to claim 10, characterized in that the update module is configured to combine the spatial point cloud of the target object in the image frame with the reference point cloud of the target object to obtain the updated reference point cloud.

15. The device according to claim 10, characterized by further comprising a detection module and a tracking module, wherein:

the detection module is configured to detect the target object in a first image frame when the first image frame is received, and to determine the position of the target object in the first image frame;

the tracking module is configured to track the target object in each subsequent image frame received after the first image frame, and to determine the position of the target object in each subsequent image frame;

wherein, for a subsequent image frame in which the position of the target object cannot be determined by tracking, the detection module is further configured to detect the target object again in that subsequent image frame and determine the position of the target object.

16. The device according to any one of claims 10 to 15, characterized in that each image frame is captured by a video acquisition device, and the video acquisition device includes a depth video acquisition device and/or a color video acquisition device.

17. The device according to any one of claims 10 to 15, characterized in that the target object includes a person's head or face.

18. The device according to any one of claims 10 to 15, characterized in that the reference pose is the pose in which the target object is frontal.