CN103942795A - Structural synthesis method of image object - Google Patents

Structural synthesis method of image object

Info

Publication number
CN103942795A
Authority
CN
China
Prior art keywords
image
three-dimensional
parts
proxy
image object
Prior art date
Legal status
Granted
Application number
CN201410163775.0A
Other languages
Chinese (zh)
Other versions
CN103942795B (en)
Inventor
Kun Zhou (周昆)
Weiwei Xu (许威威)
Xiang Chen (陈翔)
Shijie Yang (杨世杰)
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University (ZJU)
Priority to CN201410163775.0A
Publication of CN103942795A
Application granted
Publication of CN103942795B
Status: Active

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a structured synthesis method for image objects. The method comprises the following steps: calibrating camera parameters from simple user interaction and generating structured three-dimensional proxies by combining the camera parameters with the segmentation information of the image object; using the three-dimensional proxies and contact-point information to combine and connect image components taken from different sources and different viewpoints into a novel image object, and obtaining the result image through intelligent color adjustment; performing statistical learning on an image data set of a specific object category, based on consistently segmented image components, to obtain a probabilistic graphical model; sampling component categories and styles by probabilistic inference on the learned Bayesian graphical model to obtain high-probability composition schemes and viewpoint attributes; and generating the result images with the viewpoint-aware image object synthesis method. With this method, a large number of novel image objects with high photographic quality and rich variations in structure and shape can be synthesized, while also providing a good basis and guidance for three-dimensional shape modeling.

Description

Structured synthesis method for image objects
Technical field
The present invention relates generally to the field of digital media, and in particular to applications such as image creation and editing, industrial and art design, virtual object and character creation, and three-dimensional modeling.
Background technology
The technical background relevant to the present invention is summarized as follows:
1. Image compositing and synthesis
The fundamental purpose of image compositing and synthesis is to create visually plausible and believable new images from multiple image sources.
Image compositing typically focuses on novel blending methods that seamlessly stitch the selected image content. Early work includes multiresolution spline techniques (BURT, P. J., and ADELSON, E. H. 1983. A multiresolution spline with application to image mosaics. ACM Trans. Graph. 2, 4, 217–236; OGDEN, J. M., ADELSON, E. H., BERGEN, J. R., and BURT, P. J. 1985. Pyramid-based computer graphics. RCA Engineer 30, 5, 4–15) and compositing operations (PORTER, T., and DUFF, T. 1984. Compositing digital images. SIGGRAPH Comput. Graph. 18, 3, 253–259). Since the appearance of Poisson image editing (PÉREZ, P., GANGNET, M., and BLAKE, A. 2003. Poisson image editing. ACM Transactions on Graphics 22, 3, 313–318), gradient-domain compositing methods (JIA, J., SUN, J., TANG, C.-K., and SHUM, H.-Y. 2006. Drag-and-drop pasting. ACM Transactions on Graphics 25, 631–637; FARBMAN, Z., HOFFER, G., LIPMAN, Y., COHEN-OR, D., and LISCHINSKI, D. 2009. Coordinates for instant image cloning. ACM Transactions on Graphics 28, 67; TAO, M. W., JOHNSON, M. K., and PARIS, S. 2010. Error-tolerant image compositing. In Computer Vision, ECCV 2010, Springer, 31–44; SUNKAVALLI, K., JOHNSON, M. K., MATUSIK, W., and PFISTER, H. 2010. Multi-scale image harmonization. ACM Transactions on Graphics 29, 4, 125; SZELISKI, R., UYTTENDAELE, M., and STEEDLY, D. 2011. Fast Poisson blending using multi-splines. In ICCP 2011, IEEE, 1–8) have in recent years become the standard technique for seamless stitching. More recently, Xue et al. (XUE, S., AGARWALA, A., DORSEY, J., and RUSHMEIER, H. 2012. Understanding and improving the realism of image composites. ACM Transactions on Graphics 31, 4, 84) improved the visual plausibility of composites by adjusting the appearance of the composited objects.
Image synthesis, on the other hand (DIAKOPOULOS, N., ESSA, I., and JAIN, R. 2004. Content based image synthesis. In Image and Video Retrieval, Springer, 299–307; JOHNSON, M., BROSTOW, G. J., SHOTTON, J., ARANDJELOVIC, O., KWATRA, V., and CIPOLLA, R. 2006. Semantic photo synthesis. Computer Graphics Forum 25, 407–413; LALONDE, J.-F., HOIEM, D., EFROS, A. A., ROTHER, C., WINN, J., and CRIMINISI, A. 2007. Photo clip art. ACM Transactions on Graphics 26, 3), is mainly concerned with selecting and arranging visual content. One representative class of work is image collage, which synthesizes an image from multiple images under certain constraints. This line of work was initiated by interactive digital photomontage (AGARWALA, A., DONTCHEVA, M., AGRAWALA, M., DRUCKER, S., COLBURN, A., CURLESS, B., SALESIN, D., and COHEN, M. 2004. Interactive digital photomontage. ACM Transactions on Graphics 23, 294–302) and has been followed by many subsequent works, such as digital tapestry (ROTHER, C., KUMAR, S., KOLMOGOROV, V., and BLAKE, A. 2005. Digital tapestry [automatic image synthesis]. In CVPR 2005, IEEE, 589–596), AutoCollage (ROTHER, C., BORDEAUX, L., HAMADI, Y., and BLAKE, A. 2006. AutoCollage. ACM Transactions on Graphics 25, 847–852), picture collage (WANG, J., QUAN, L., SUN, J., TANG, X., and SHUM, H.-Y. 2006. Picture collage. In CVPR 2006, IEEE, 347–354), puzzle-like collage (GOFERMAN, S., TAL, A., and ZELNIK-MANOR, L. 2010. Puzzle-like collage. Computer Graphics Forum 29, 459–468), Sketch2Photo (CHEN, T., CHENG, M.-M., TAN, P., SHAMIR, A., and HU, S.-M. 2009. Sketch2Photo: Internet image montage. ACM Transactions on Graphics 28, 5, 124:1–10), PhotoSketcher (EITZ, M., RICHTER, R., HILDEBRAND, K., BOUBEKEUR, T., and ALEXA, M. 2011. Photosketcher: interactive sketch-based image synthesis. IEEE Computer Graphics and Applications 31, 6, 56–66), Arcimboldo-like collage (HUANG, H., ZHANG, L., and ZHANG, H.-C. 2011. Arcimboldo-like collage using internet images. ACM Transactions on Graphics 30, 6, 155), and, most recently, circle-packing collage (YU, Z., LU, L., GUO, Y., FAN, R., LIU, M., and WANG, W. 2013. Content-aware photo collage using circle packing. IEEE Transactions on Visualization and Computer Graphics 99, PrePrints).
Most of the image compositing and synthesis algorithms above implicitly assume that the composited content shares the same viewpoint as the source images, so they do not handle camera parameter information. In photo clip art (LALONDE, J.-F., HOIEM, D., EFROS, A. A., ROTHER, C., WINN, J., and CRIMINISI, A. 2007. Photo clip art. ACM Transactions on Graphics 26, 3), the authors attempt to infer the camera pose from object heights, but this approach cannot handle true three-dimensional relationships and therefore has difficulty with complex rotational transformations. In a recent work, Zheng et al. (ZHENG, Y., CHEN, X., CHENG, M.-M., ZHOU, K., HU, S.-M., and MITRA, N. J. 2012. Interactive images: cuboid proxies for smart image manipulation. ACM Trans. Graph. 31, 4, 99:1–99:11) represent image objects with three-dimensional cuboid proxies and explicitly optimize camera and geometric parameters. The method in the present invention likewise uses a three-dimensional proxy representation, but must handle the more challenging spatial relationships and structures among non-cuboid parts.
2. Data-driven three-dimensional model synthesis
Data-driven synthesis of three-dimensional models has recently attracted considerable research interest in the graphics community. Its goal is to automatically synthesize, by combining parts from a collection of input three-dimensional shapes, a large number of novel shapes that satisfy the structural constraints of the input shape set. Data-driven three-dimensional modeling was first proposed by Funkhouser et al. (FUNKHOUSER, T., KAZHDAN, M., SHILANE, P., MIN, P., KIEFER, W., TAL, A., RUSINKIEWICZ, S., and DOBKIN, D. 2004. Modeling by example. ACM Trans. Graph. 23, 3, 652–663), whose example-based modeling lets users search a segmented three-dimensional part library and interactively assemble the parts into new shapes. In follow-up work, some systems search for parts from user-input sketches (SHIN, H., and IGARASHI, T. 2007. Magic canvas: interactive design of a 3-D scene prototype from freehand sketches. In Graphics Interface, 63–70; LEE, J., and FUNKHOUSER, T. A. 2008. Sketch-based search and composition of 3D models. In SBM, 97–104), while others allow users to exchange components among a small group of matched shapes (KREAVOY, V., JULIUS, D., and SHEFFER, A. 2007. Model composition from interchangeable components. In Proceedings of the 15th Pacific Conference on Computer Graphics and Applications, IEEE Computer Society, PG '07, 129–138). Chaudhuri et al. (CHAUDHURI, S., and KOLTUN, V. 2010. Data-driven suggestions for creativity support in 3D modeling. ACM Trans. Graph. 29, 6, 183:1–183:10) proposed a data-driven method that recommends suitable parts for an incomplete shape under design, and later designed a probabilistic representation of shape structure that recommends parts better matched in semantics and style (CHAUDHURI, S., KALOGERAKIS, E., GUIBAS, L., and KOLTUN, V. 2011. Probabilistic reasoning for assembly-based 3D modeling. ACM Trans. Graph. 30, 4, 35:1–35:10). Kalogerakis et al. extended this probabilistic-inference approach and applied it to the synthesis of complete shapes (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., and KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4, 55:1–55:11).
Summary of the invention
The object of the present invention is to address the deficiencies of the prior art by providing a structured synthesis method for image objects. Given a group of image objects of the same category captured from different viewpoints, the method synthesizes visually plausible and realistic image objects by combining image components.
The object of the invention is achieved through the following technical solution: a structured synthesis method for image objects, comprising the following steps:
(1) Preprocessing of image object data: collect an image set of objects of a particular category using digital or network equipment, requiring that the object structures are completely visible, and use image segmentation and labeling tools to obtain consistently segmented regions of the objects' constituent parts;
(2) Viewpoint-aware image object synthesis: calibrate the camera parameters of a single image from simple user interaction, generate structured three-dimensional proxies by combining the camera parameters with the segmentation information of the image object, then use the three-dimensional proxies and contact-point information to connect image components into a novel image object, and finally obtain the result image through intelligent color adjustment;
(3) Training and synthesis with a Bayesian probabilistic graphical model: based on the consistently segmented image components, perform statistical learning on the image data set of the specific object category to obtain a probabilistic graphical model that can express the complex dependencies among shape style, object structure, component category, and camera parameters; then, by probabilistic inference on the learned Bayesian graphical model, sample component categories, styles, and so on to obtain high-probability composition schemes and viewpoint attributes for image objects, and finally synthesize the result images using the method of step 2;
(4) Export of the image object synthesis results: export and store, in general formats, the result images obtained in steps 2 and 3, together with the camera parameters and three-dimensional proxy data obtained in step 2.
The beneficial effects of the invention are as follows: by performing viewpoint-aware, part-level synthesis of image objects, components taken from images with different viewpoints can be connected and synthesized into novel image objects with consistent viewpoints and correct structures. In addition, the invention first proposes a single-view camera calibration method based on a coordinate frame, suitable for general single-image camera calibration without obvious or complete geometric cues; proposes a structure-aware three-dimensional proxy construction method suitable for building cuboid proxies at the component level of image objects; proposes a structured image component synthesis method guided by three-dimensional proxies; proposes an application that synthesizes, from given sample image objects, large batches of image objects with rich variations in shape and style; and proposes a Bayesian probabilistic graphical model that integrates image viewpoint information and is suitable for characterizing the viewpoint, structure, and shape variations of an image object set. Compared with existing three-dimensional shape synthesis techniques, the method can fully exploit the advantages of ordinary image data, which is enormous in quantity, easy to obtain, and rich in color and appearance information, to synthesize large numbers of new image objects with photographic quality and rich variations in structure and shape, meeting the requirements of many image-editing applications while also providing a good basis and guidance for three-dimensional shape modeling.
Brief description of the drawings
Fig. 1 is a flowchart of the viewpoint-aware image object synthesis method in the present invention;
Fig. 2 illustrates single-view camera calibration and structured three-dimensional proxy construction in the present invention: (a) the input image and the user interaction for camera calibration; (b) the initial three-dimensional proxies of the object's parts obtained from the camera parameters; (c) the optimization of the three-dimensional proxies under object structure constraints; (d) the result after structure optimization;
Fig. 3 shows the elements used in the image object synthesis process of the present invention: (a) the three-dimensional proxies and their connection slots; (b) the segmented parts of the image object; (c) the two-dimensional reference points on the image boundaries used for image warping; (d) the two-dimensional contact points used for connecting image components;
Fig. 4 illustrates the part color optimization process for image objects in the present invention;
Fig. 5 is the main flowchart of image object training and synthesis based on the probabilistic graphical model in the present invention;
Fig. 6 shows structured synthesis results obtained from a chair image set in the present invention;
Fig. 7 shows structured synthesis results obtained from a cup image set in the present invention;
Fig. 8 shows structured synthesis results obtained from a desk lamp image set in the present invention;
Fig. 9 shows structured synthesis results obtained from a toy airplane image set in the present invention;
Fig. 10 shows structured synthesis results obtained from a robot image set in the present invention;
Fig. 11 shows the results of a user study evaluating the novelty of the synthesized image objects obtained in the experiments;
Fig. 12 compares direct compositing with viewpoint-aware image object synthesis on chairs;
Fig. 13 compares direct compositing with viewpoint-aware image object synthesis on toy airplanes.
Detailed description of the embodiments
The core of the present invention is a part-level, structure-aware image object synthesis method based on an image object set, capable of producing rich variations in shape and style. The core method consists of four parts: preprocessing of image object data, viewpoint-aware image object synthesis, training and synthesis with a Bayesian probabilistic graphical model, and export of the image object synthesis results.
1. Preprocessing of image object data: an image set of a particular object category is collected using digital equipment or the network, requiring the part structures of the objects to be completely visible; image segmentation tools are used to obtain the region of each part in the image, with semantic labels assigned at the same time.
1.1 Acquisition of the image object set
This method operates on ordinary digital images. As input data, it requires collecting a number of images of objects of the same category, which are scaled and cropped to a uniform size. Because the method performs part-level structured synthesis, this step requires the part structures of the collected image objects to be completely visible.
1.2 Assisted user interaction
Because the content and boundaries of the object parts in the images have complex morphological features that are difficult to recognize and segment automatically and robustly, this method relies on a moderate amount of user interaction to preprocess the image object set for the subsequent steps. The Lazy Snapping technique (LI, Y., SUN, J., TANG, C.-K., and SHUM, H.-Y. 2004. Lazy snapping. ACM Transactions on Graphics 23, 3, 303–308) is used to segment the whole object region, and the LabelMe tool (RUSSELL, B. C., TORRALBA, A., MURPHY, K. P., and FREEMAN, W. T. 2008. LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision 77, 1-3, 157–173) is then used to segment and label each part region of the object. For occluded image parts, the PatchMatch technique (BARNES, C., SHECHTMAN, E., FINKELSTEIN, A., and GOLDMAN, D. 2009. PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics 28, 3, 24:1–11) is used to complete the occluded regions.
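The segmentation and labeling step above produces per-part region masks that the later steps consume. The following is a minimal sketch of turning a LabelMe-style polygon annotation into per-part masks; the JSON field names and the use of OpenCV are assumptions, since the text only names the tools involved.

```python
# Minimal sketch: turn LabelMe-style polygon annotations into per-part binary masks.
# The JSON layout ("shapes", "label", "points") follows the common LabelMe export
# format; field names may differ for other annotation tools.
import json
import numpy as np
import cv2

def load_part_masks(annotation_path, image_shape):
    """Return {part_label: HxW uint8 mask} for one annotated image."""
    with open(annotation_path) as f:
        ann = json.load(f)
    h, w = image_shape[:2]
    masks = {}
    for shape in ann.get("shapes", []):
        label = shape["label"]                       # e.g. "seat", "leg", "back"
        poly = np.array(shape["points"], dtype=np.int32)
        mask = masks.setdefault(label, np.zeros((h, w), dtype=np.uint8))
        cv2.fillPoly(mask, [poly], 255)              # rasterize the part polygon
    return masks
```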
2. Viewpoint-aware image object synthesis
This image object synthesis method takes an image set of objects of the same category (for example, chairs) as input and semi-automatically analyzes their structure and extracts the camera parameters. Each image object is then represented by three-dimensional cuboid proxies that match its structure. Under the guidance of the three-dimensional proxies, parts of the image objects can be connected and synthesized into novel, perspective-correct, complete image objects.
2.1 Image object representation
Based on the results of steps 1.1 and 1.2, a structured representation characterizing the relationships among the semantic parts of an image object is constructed. Each image object is represented as a graph G = {V, E}, where V is the set of nodes and E is the set of edges. Each part C_i is a node of V; when two nodes C_i and C_j are connected, there is an edge e_{ij} in E. Here C_i = {P_i, S_i, cl, B_i}, where P_i is the set of pixels belonging to part C_i, S_i is its segmentation boundary, cl is the set of dominant colors extracted from the part pixels by the k-means method (k = 2), and B_i is its corresponding three-dimensional cuboid proxy (obtained in the subsequent step 2.2). This structured representation is used frequently in the subsequent steps.
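A minimal sketch of this structured representation is shown below; the class and field names are illustrative assumptions, since the text only specifies what each part record C_i contains.

```python
# Minimal sketch of the structured representation G = {V, E} described above.
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Cuboid:
    center: np.ndarray           # (c_x, c_y, c_z)
    size: np.ndarray             # (l_x, l_y, l_z), axis-aligned

@dataclass
class Part:                      # a node C_i in V
    pixels: np.ndarray           # P_i: boolean HxW mask of pixels in the part
    boundary: np.ndarray         # S_i: Nx2 array of boundary points
    dominant_colors: np.ndarray  # cl: 2x3 array from k-means (k = 2) on part pixels
    proxy: Cuboid = None         # B_i: 3D cuboid proxy, filled in by step 2.2

@dataclass
class ImageObjectGraph:
    parts: list                                 # V: list of Part
    edges: set = field(default_factory=set)     # E: pairs (i, j) of connected parts
```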
2.2 Generation of the three-dimensional image proxies: the camera parameters of each image are calibrated from simple user interaction, and structured three-dimensional proxies are generated by combining the camera parameters with the part segmentation information of the image object.
2.2.1 Camera calibration based on a coordinate frame
This method takes one two-dimensional vertex and three two-dimensional vectors (the image projections of the origin and the axes of a three-dimensional coordinate frame) as input. Compared with existing single-view camera calibration methods, it is better suited to general images in which the object carries few geometric cues.
The camera projection matrix M_{3\times4} can be expressed as:

M_{3\times4} = K[R \mid t], \quad K = \begin{pmatrix} f & 0 & u \\ 0 & f & v \\ 0 & 0 & 1 \end{pmatrix}, \quad t = (t_x, t_y, t_z)

where K is the camera intrinsic matrix with the principal point {u, v} set to the image center and the focal length f as a variable, R is an orthogonal rotation matrix parameterized by Euler angles, and t is the translation vector, giving 7 variable parameters in total.
The image projections P_o and P_{up} of the origin of the three-dimensional coordinate frame and of the point (0, 0, 1) can be written in homogeneous coordinates as

P_o = (o_1, o_2, 1)^T, \quad P_{up} = (z_1, z_2, 1)^T

and the image projections {l_x, l_y, l_z} of the three coordinate axes {x, y, z} as

l_x = (l_{x1}, l_{x2}, 1)^T, \quad l_y = (l_{y1}, l_{y2}, 1)^T, \quad l_z = (l_{z1}, l_{z2}, 1)^T

According to projective geometry, a system of equations relating these projections to the camera parameters can be set up; after expansion, the following 7 equations are obtained:
(f c_1 c_2 - u s_2) l_{x1} + (f c_2 s_1 - v s_2) l_{x2} - s_2 = 0
(f c_1 s_2 s_3 - f c_3 s_1 + u c_2 s_3) l_{y1} + (f c_1 c_3 + f s_1 s_2 s_3 + v c_2 s_3) l_{y2} + c_2 s_3 = 0
(f s_1 s_3 + f c_1 c_3 s_2 + u c_2 c_3) l_{z1} + (f c_3 s_1 s_2 - f c_1 s_3 + v c_2 c_3) l_{z2} + c_2 c_3 = 0
f t_1 + u t_3 - o_1 t_3 = 0
f t_2 + v t_3 - o_2 t_3 = 0
f s_1 s_3 + f c_1 c_3 s_2 + u c_2 c_3 + f t_1 + u t_3 - z_1 c_2 c_3 - z_1 t_3 = 0
f c_3 s_1 s_2 - f c_1 s_3 + v c_2 c_3 + f t_2 + v t_3 - z_2 c_2 c_3 - z_2 t_3 = 0
where c_i and s_i denote the cosines and sines of the Euler angles parameterizing R, and (t_1, t_2, t_3) = (t_x, t_y, t_z).
This method solves the above system of equations by nonlinear optimization to obtain the camera parameters.
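A minimal sketch of this calibration step is shown below: the 7 equations above are stacked as residuals and solved with a nonlinear least-squares solver. The use of SciPy, the parameter ordering, and the initial guess are assumptions; the text only states that the system is solved by nonlinear optimization.

```python
# Minimal sketch: solve the 7 calibration equations for the 7 unknowns
# (focal length f, three Euler angles, translation t) by nonlinear least squares.
import numpy as np
from scipy.optimize import least_squares

def residuals(params, u, v, lx, ly, lz, o, z):
    f, a1, a2, a3, t1, t2, t3 = params
    c1, c2, c3 = np.cos([a1, a2, a3])
    s1, s2, s3 = np.sin([a1, a2, a3])
    return [
        (f*c1*c2 - u*s2)*lx[0] + (f*c2*s1 - v*s2)*lx[1] - s2,
        (f*c1*s2*s3 - f*c3*s1 + u*c2*s3)*ly[0] + (f*c1*c3 + f*s1*s2*s3 + v*c2*s3)*ly[1] + c2*s3,
        (f*s1*s3 + f*c1*c3*s2 + u*c2*c3)*lz[0] + (f*c3*s1*s2 - f*c1*s3 + v*c2*c3)*lz[1] + c2*c3,
        f*t1 + u*t3 - o[0]*t3,
        f*t2 + v*t3 - o[1]*t3,
        f*s1*s3 + f*c1*c3*s2 + u*c2*c3 + f*t1 + u*t3 - z[0]*c2*c3 - z[0]*t3,
        f*c3*s1*s2 - f*c1*s3 + v*c2*c3 + f*t2 + v*t3 - z[1]*c2*c3 - z[1]*t3,
    ]

def calibrate(u, v, lx, ly, lz, o, z, f0=1000.0):
    x0 = [f0, 0.1, 0.1, 0.1, 0.0, 0.0, 5.0]   # rough initial guess (assumption)
    sol = least_squares(residuals, x0, args=(u, v, lx, ly, lz, o, z))
    return sol.x                               # f, Euler angles, translation
```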
2.2.2 Structure-aware three-dimensional proxy fitting
Based on the camera projection matrix M obtained in step 2.2.1, this method adopts the interactive images technique (ZHENG, Y., CHEN, X., CHENG, M.-M., ZHOU, K., HU, S.-M., and MITRA, N. J. 2012. Interactive images: cuboid proxies for smart image manipulation. ACM Trans. Graph. 31, 4, 99:1–99:11) to initialize axis-aligned cuboids. Because no structural relations are imposed at this stage, the independently initialized cuboids are scattered in space. This method therefore performs a global optimization to recover the structural relationships among the parts; the objective is to satisfy the geometric relationships among the parts in three-dimensional space while respecting the image boundaries. The energy is:
E(B_1, B_2, \ldots, B_N) = E_{fitting} + E_{unary} + E_{pair}
The first term E_{fitting} penalizes the deviation of the optimization result from the initial cuboids, expressed as the accumulated distance between the two-dimensional projections of the vertices of the optimized cuboids and of the initial cuboids:

E_{fitting} = \sum_i^N \sum_k \| M v_k - M \bar{v}_k \|^2

where N is the number of parts, and v_k and \bar{v}_k are the vertices of the optimized cuboid B_i and of the initial cuboid, respectively; normalized homogeneous coordinates are used here.
The unary constraint term E_{unary} penalizes violations of the structural constraints on a single proxy. Two kinds of constraints, {GlobReflection, OnGround}, are used to guarantee the correct relationships between the parts and the camera parameters: GlobReflection requires a cuboid to be reflectively symmetric about a given world-coordinate plane, and OnGround requires a cuboid to rest on the ground. The term is defined over the sets of cuboids subject to the GlobReflection and OnGround constraints, respectively, where Dist is the point-to-plane distance function and the ground constraint is measured at the vertex of the cuboid with the smallest z value.
The pairwise constraint term E_{pair} penalizes violations of the structural constraints between two proxies. Three kinds of constraints, {Symmetry, On, Side}, are used to guarantee the pairwise structural relationships between cuboids: Symmetry requires two cuboids to be reflectively symmetric about a given world-coordinate plane, while On and Side require one cuboid to rest on top of, or lean against the side of, another cuboid. The term is defined over the set of cuboid pairs satisfying the reflective symmetry constraint and the sets of cuboid pairs satisfying the On and Side constraints, respectively; the function Rf computes the mirror position of a point with respect to the plane p, and c_i is the center point of cuboid B_i. The first part of the term enforces the reflective symmetry constraint by requiring the centers of the two geometric proxies to be symmetric about the plane, while the second and third parts enforce the On and Side constraints by penalizing the distance between a face center of one cuboid and the top or side face of the other: bc_i is the bottom-face center of B_i and tp_j is the top face of B_j; similarly, sc_i is the side-face center of B_i and sp_j is the side face of B_j.
In this method the cuboids are axis-aligned, so each cuboid requires optimizing only 6 parameters, namely its size and center {l_x, l_y, l_z, c_x, c_y, c_z}. The total energy is minimized with the Levenberg-Marquardt nonlinear optimization method (LOURAKIS, M. 2004. levmar: Levenberg-Marquardt nonlinear least squares algorithms in C/C++. http://www.ics.forth.gr/~lourakis/levmar/).
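The sketch below illustrates the shape of this global fitting problem using only the recoverable pieces: the E_fitting residuals on projected cuboid vertices and, as one example of E_pair, a reflective-symmetry residual on cuboid centers. The projection helper, the data layout, and the choice of the x = 0 symmetry plane are assumptions; the full constraint terms follow the descriptions above.

```python
# Minimal sketch of the proxy-fitting objective E = E_fitting + E_unary + E_pair.
import numpy as np
from scipy.optimize import least_squares

def project(M, pts):                        # pinhole projection of 3D points to 2D
    ph = (M @ np.c_[pts, np.ones(len(pts))].T).T
    return ph[:, :2] / ph[:, 2:3]

def cuboid_vertices(center, size):          # 8 corners of an axis-aligned cuboid
    signs = np.array([[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
    return center + 0.5 * signs * size

def residuals(x, M, init_boxes, sym_pairs):
    boxes = x.reshape(-1, 6)                # per cuboid: (cx, cy, cz, lx, ly, lz)
    res = []
    # E_fitting: 2D distances between projected vertices of optimized and initial boxes
    for b, (c0, l0) in zip(boxes, init_boxes):
        v = project(M, cuboid_vertices(b[:3], b[3:]))
        v0 = project(M, cuboid_vertices(c0, l0))
        res.extend((v - v0).ravel())
    # one E_pair example (Symmetry about the x = 0 plane): mirrored centers coincide
    for i, j in sym_pairs:
        res.extend(boxes[i, :3] * np.array([-1, 1, 1]) - boxes[j, :3])
    return res

# x0 = initial boxes flattened; method="lm" selects Levenberg-Marquardt in SciPy:
# sol = least_squares(residuals, x0, args=(M, init_boxes, sym_pairs), method="lm")
```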
2.3 Proxy-guided part synthesis: after the camera parameters and the structured three-dimensional proxies have been estimated, each three-dimensional proxy is translated and scaled in three-dimensional space, and the selected parts are then synthesized into a single image object by two-dimensional image warping and stitching guided by the transformed proxies.
2.3.1 Connecting the parts
This method uses "connection slots" (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., and KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4, 55:1–55:11; SHEN, C.-H., FU, H., CHEN, K., and HU, S.-M. 2012. Structure recovery by part assembly. ACM Transactions on Graphics 31, 6, 180) to connect the object parts. After the three-dimensional proxies have been built, a pair of connection slots is generated between every two connected parts, where each slot records: (a) the target part connected to the part owning the slot; (b) the contact points connecting the two parts; and (c) the scale of the connected target part. The three-dimensional contact point is generated with the following strategy: if the two proxies intersect, take the midpoint of the intersection; otherwise, take the midpoint of the face of the smaller-volume proxy that is closest to the larger-volume proxy. The two-dimensional contact points are the points on the image boundaries of the connected parts that lie close to each other (within a given threshold, 5 pixels in this method).
(1) Connecting the three-dimensional proxies
This method optimizes the position and size of each proxy B_i under the constraints imposed by the proxy connection relationships. Let c_i and l_i be the center and size of proxy B_i, together with the three-dimensional contact point of its connection slot k. The transformation applied to B_i is a rigid transformation composed of a scaling Λ_i = diag(s_i) and a translation, where s_i and t_i are the scaling and translation parts of the transformation. Because the transformation acting on B_i is constrained by the size and position of the proxies connected to it, the size of the target proxy connected to B_i through slot k in the original training image is also recorded. This method defines a contact energy term E_c whose sum runs over the set of proxy pairs matched through connection slots, where m_i and m_j are the indices of the matched slots within their respective proxies; this term pulls the matched contact points together and makes the connected proxies assume compatible sizes. In addition, two shape-preserving energy terms E_s and E_t are used to avoid excessive distortion during optimization:

E_s = \sum_i \| s_i - (1, 1, 1)^T \|^2, \quad E_t = \sum_i \| t_i \|^2

This method obtains the optimal transformation T_i^* of each proxy B_i by minimizing the following energy:

T_i^* = \arg\min_{T_i} \; \omega_c E_c + \omega_s E_s + \omega_t E_t

where this method uses the parameters ω_c = 1, ω_s = 0.5, and ω_t = 0.1.
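A minimal sketch of this connection optimization is given below, with E_c written simply as the squared distance between matched, transformed contact points, plus the E_s and E_t regularizers; the exact form of E_c and the data layout are assumptions based on the description above.

```python
# Minimal sketch: solve per-proxy scale s_i and translation t_i so that matched
# contact points coincide, with E_s and E_t keeping transforms near the identity.
import numpy as np
from scipy.optimize import minimize

def energy(x, proxies, matches, wc=1.0, ws=0.5, wt=0.1):
    n = len(proxies)
    s = x[:3 * n].reshape(n, 3)             # per-proxy scale
    t = x[3 * n:].reshape(n, 3)             # per-proxy translation
    Ec = 0.0
    for (i, pi), (j, pj) in matches:        # pi, pj: 3D contact points of matched slots
        qi = s[i] * pi + t[i]               # transformed contact point of proxy i
        qj = s[j] * pj + t[j]
        Ec += np.sum((qi - qj) ** 2)        # matched contact points should coincide
    Es = np.sum((s - 1.0) ** 2)             # keep scales near 1
    Et = np.sum(t ** 2)                     # keep translations small
    return wc * Ec + ws * Es + wt * Et

def connect_proxies(proxies, matches):
    x0 = np.concatenate([np.ones(3 * len(proxies)), np.zeros(3 * len(proxies))])
    return minimize(energy, x0, args=(proxies, matches)).x
```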
(2) Warping the two-dimensional image parts
After the three-dimensional proxies are connected, this method computes and warps the two-dimensional image parts corresponding to the three-dimensional proxies. When the three-dimensional proxies of an image are constructed in step 2.2, n_i two-dimensional reference points are uniformly sampled on the image boundary of each proxy B_i and projected onto the visible faces of the proxy to obtain the corresponding three-dimensional reference points (points projecting outside the two-dimensional projection boundary of the proxy are removed). In all experiments of this method, n_i = 200.
The three-dimensional reference points are mapped through the optimal transformation computed in step (1) to obtain the corresponding three-dimensional target points, which are then projected onto the two-dimensional image plane of the synthesized scene to obtain the two-dimensional target points.
Because the viewpoint changes allowed by the three-dimensional proxies are limited, this method performs the image warping with a two-dimensional affine transformation. The optimal affine transformation matrix A_i^* is obtained by minimizing the distances between the warped two-dimensional reference points and the two-dimensional target points:

A_i^* = \arg\min_{A_i} \sum_r \| A_i \cdot \hat{a}_{i,r} - \hat{b}_{i,r} \|^2

Each part is then warped with A_i^*.
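The least-squares fit of A_i^* and the warp itself can be sketched as follows; the use of NumPy/OpenCV and the function signatures are assumptions.

```python
# Minimal sketch: estimate the 2D affine transform A_i from the reference points a
# (original boundary samples) to the target points b (projections of the transformed
# proxy), then warp the part image with it.
import numpy as np
import cv2

def fit_affine(a, b):
    """a, b: Nx2 arrays of corresponding 2D points; returns a 2x3 affine matrix."""
    A = np.hstack([a, np.ones((len(a), 1))])        # homogeneous reference points
    # solve A @ X ~= b in the least-squares sense; X.T is the 2x3 affine matrix
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X.T

def warp_part(part_image, a, b, out_size):
    M = fit_affine(a, b)
    return cv2.warpAffine(part_image, M, out_size)  # out_size = (width, height)
```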
(3) Connecting the two-dimensional image parts
After step (2), the two-dimensional contact points in each connection slot k of each part i are likewise transformed by A_i^* to obtain their current coordinates. Breadth-first search is then used to move the image parts: the largest part serves as the reference and is inserted into a queue; whenever a part is popped from the queue, all parts that are connected to it and have not yet been visited are moved according to the current two-dimensional contact-point information in the connection slots and are then inserted into the queue in turn. The search terminates once all parts have been visited, yielding the composite image object with the two-dimensional parts connected. When part i is moved onto part j (with matched slots m_i and m_j), the strategy is first to align the matched two-dimensional contact points, and then to repeatedly move any contact point of part i that lies outside the segmentation boundary of part j to the closest point on that boundary.
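A minimal sketch of this breadth-first placement is shown below; the data structures are assumptions, and only the traversal and the contact-point alignment described above are implemented.

```python
# Minimal sketch of the breadth-first placement of 2D parts: the largest part is the
# anchor, and each neighbor is translated so that its contact point matches the
# contact point of the already-placed part.
from collections import deque
import numpy as np

def place_parts(offsets, slots, areas):
    """
    offsets: {pid: 2D translation (np.ndarray), modified in place}
    slots:   {(i, j): (contact_i, contact_j)} 2D contact points of matched slot pairs
    areas:   {pid: pixel area}, used to pick the largest part as the anchor
    """
    root = max(areas, key=areas.get)
    visited, queue = {root}, deque([root])
    while queue:
        i = queue.popleft()
        for (a, b), (ca, cb) in slots.items():
            if a == i and b not in visited:
                nb, c_nb, c_i = b, cb, ca
            elif b == i and a not in visited:
                nb, c_nb, c_i = a, ca, cb
            else:
                continue
            # translate the neighbor so its contact point lands on part i's contact point
            offsets[nb] += (c_i + offsets[i]) - (c_nb + offsets[nb])
            visited.add(nb)
            queue.append(nb)
    return offsets
```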
2.3.2 Color optimization
On the synthesized result, this method optimizes colors based on a color-compatibility model (O'DONOVAN, P., AGARWALA, A., and HERTZMANN, A. 2011. Color compatibility from large datasets. ACM Transactions on Graphics 30, 4, 63) and a data-driven palette. After step 1, a 5-color palette is extracted from each image of the data set with k-means, and a 40-color data palette is then extracted with k-means from all of these colors. In the synthesized image, the hue of the dominant color of the largest part is combined with the data palette (controlled by a variance parameter σ) to generate new palettes. A color optimization strategy similar to that used for outfit synthesis (Outfit synthesis through automatic optimization. ACM Transactions on Graphics 31, 6, 134:1–134:14) then selects from the new palettes the color set with the highest compatibility score, and the selected colors are assigned to the parts with a color transfer method (REINHARD, E., ADHIKHMIN, M., GOOCH, B., and SHIRLEY, P. 2001. Color transfer between images. IEEE Computer Graphics and Applications 21, 5, 34–41).
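Two of the color utilities used above can be sketched as follows: k-means palette extraction (5 colors per image, as stated) and a Lab-space approximation of Reinhard-style color transfer (the original method uses the lαβ space). The library choices and the Lab approximation are assumptions.

```python
# Minimal sketch of the color utilities: k-means palette extraction and a Lab-space
# approximation of Reinhard color transfer (per-channel mean/std matching).
import numpy as np
import cv2
from sklearn.cluster import KMeans

def extract_palette(image_bgr, k=5):
    pixels = image_bgr.reshape(-1, 3).astype(np.float32)
    return KMeans(n_clusters=k, n_init=4).fit(pixels).cluster_centers_

def reinhard_transfer(source_bgr, target_bgr):
    """Shift the source part's colors toward the target color statistics."""
    src = cv2.cvtColor(source_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt = cv2.cvtColor(target_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    for c in range(3):
        m_s, s_s = src[..., c].mean(), src[..., c].std() + 1e-6
        m_t, s_t = tgt[..., c].mean(), tgt[..., c].std() + 1e-6
        src[..., c] = (src[..., c] - m_s) * (s_t / s_s) + m_t
    out = np.clip(src, 0, 255).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)
```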
3. Training and synthesis with the Bayesian probabilistic graphical model
In this step, statistical learning is first performed on the image data set of the specific object category, based on the consistently segmented image components, to obtain a probabilistic graphical model that can express the complex dependencies among shape style, object structure, component category, and camera parameters. Then, by probabilistic inference on the learned probabilistic graphical model, component categories, styles, and so on are sampled to obtain high-probability composition schemes and viewpoint attributes for image objects, and all result images are finally generated with the viewpoint-aware image object synthesis method.
3.1 Training of the Bayesian probabilistic graphical model
This method models an image object set of a specific category with a probabilistic graphical model similar to the one used in the work of Kalogerakis et al. (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., and KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4, 55:1–55:11), in which latent variables represent the overall structure and the part styles of the object, and observed variables represent the component categories, the geometric features, and the adjacency relationships among them. The distinguishing feature of this method is an additional observed variable that characterizes the viewpoint information of the image.
3.1.1 Representation of the Bayesian probabilistic graphical model
The random variables of the probabilistic graphical model adopted by this method are as follows. V is the viewpoint parameter, obtained by Mean Shift clustering (with radius 0.2) of the camera parameters computed in step 2.2.1 and represented as the integer index of the viewpoint class. The geometric feature vector C_l contains the sizes of the three-dimensional cuboid proxies of the parts of category l and the point-distribution-model feature vectors of their two-dimensional image silhouettes (COOTES, T. F., TAYLOR, C. J., COOPER, D. H., GRAHAM, J., et al. 1995. Active shape models - their training and application. Computer Vision and Image Understanding 61, 1, 38–59); the point distribution models are computed separately for each viewpoint class. The latent variables R and S are learned from the training data.
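A minimal sketch of computing the viewpoint variable V is shown below: the per-image camera parameters are clustered with Mean Shift (bandwidth 0.2, as stated), and the cluster index serves as the discrete viewpoint label. The feature normalization and the use of scikit-learn are assumptions.

```python
# Minimal sketch of the viewpoint variable V: Mean Shift clustering of camera
# parameters, with the cluster index used as the discrete viewpoint class.
import numpy as np
from sklearn.cluster import MeanShift

def viewpoint_labels(camera_params):
    """camera_params: K x d array (e.g. Euler angles per training image)."""
    feats = np.asarray(camera_params, dtype=float)
    feats = (feats - feats.mean(0)) / (feats.std(0) + 1e-9)   # normalize each dimension
    return MeanShift(bandwidth=0.2).fit_predict(feats)        # integer class V per image
```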
The full joint probability distribution factorizes into a product of conditional probabilities, one for each variable conditioned on its parents in the graphical model: P(X_1, \ldots, X_m) = \prod_i P(X_i \mid \mathrm{parents}(X_i)).
3.1.2 Training of the Bayesian probabilistic graphical model
After steps 1 and 2, a set of feature vectors {O_k} is extracted from the K training images, where O_k = {V_k, N_k, C_k, D_k}. This method learns the structure and parameters of the probabilistic graph by maximizing the following likelihood score J:

J(G) = \log P(\{O_k\} \mid \hat{\theta}, G) - \frac{m_\theta}{2} \log K

This Bayesian-information-criterion score (SCHWARZ, G. 1978. Estimating the dimension of a model. The Annals of Statistics 6, 2, 461–464) is used to select the optimal probabilistic graph structure G together with the domains of the latent variables; \hat{\theta} is the maximum a posteriori (MAP) estimate of the parameters of G, m_\theta is the number of independent parameters, and K is the number of data items. The expectation-maximization (EM) algorithm is used to compute the maximum-likelihood parameters, where P(\theta \mid G) is the prior probability distribution of the parameters \theta. In the M step of the EM algorithm, the conditional-probability-table parameters of the discrete random variables R, V, S_l, D_l and the conditional linear Gaussian parameters of the continuous random variables C_l are computed in the same form as in the method of Kalogerakis et al. (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., and KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4, 55:1–55:11). In the E step, the conditional probabilities of the latent variables R and S given the observed variables O_k are computed with the following formulas:
P(O_k) = \sum_{R, S_l} P(R, S_l, O_k)

P(R, S_l \mid O_k) = \frac{P(R, S_l, O_k)}{P(O_k)}
where l is the component category label. The joint probability P(R, S_l, O_k) is computed according to the factorization of the model given above.
This method searches for the graph structure G that maximizes J with a greedy strategy that gradually enlarges the domains (the sets of admissible values) of the latent variables R and S.
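The structure search can be sketched as follows: the domains of R and S are enlarged greedily and the candidate with the best BIC-style score J is kept. The fit_em routine stands in for the EM parameter estimation and is an assumption; only the search skeleton follows the text.

```python
# Minimal sketch of the model-selection loop: greedily enlarge the domains of the
# latent variables R and S and keep the structure with the best BIC-style score J.
import numpy as np

def bic_score(log_likelihood, num_params, num_data):
    return log_likelihood - 0.5 * num_params * np.log(num_data)

def select_structure(data, fit_em, max_domain=10):
    best = None
    r_card = s_card = 1
    while r_card <= max_domain and s_card <= max_domain:
        candidates = [(r_card + 1, s_card), (r_card, s_card + 1)]
        scored = []
        for rc, sc in candidates:
            loglik, n_params = fit_em(data, rc, sc)       # EM fit for this domain size
            scored.append((bic_score(loglik, n_params, len(data)), rc, sc))
        score, rc, sc = max(scored)
        if best is not None and score <= best[0]:
            break                                         # no further improvement
        best, (r_card, s_card) = (score, rc, sc), (rc, sc)
    return best                                           # (best score, |R|, |S|)
```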
3.2 Synthesis with the Bayesian probabilistic graphical model
Image object synthesis is divided into three steps. First, the set of image components used for synthesis is determined. These components are then connected into a single image object. Finally, the colors of the synthesized object are optimized. The last two steps are realized with the viewpoint-aware image object synthesis technique of step 2.
3.2.1 Synthesis of the component set
Mathematically, different component sets can be regarded as different samples of the probabilistic model. This method therefore explores the shape space of image objects with a depth-first search strategy: starting from the root-node variable R, each random variable along the search path is assigned each of its possible values in turn. Consistent with the deterministic algorithm in the work of Kalogerakis et al. (KALOGERAKIS, E., CHAUDHURI, S., KOLLER, D., and KOLTUN, V. 2012. A probabilistic model for component-based shape synthesis. ACM Trans. Graph. 31, 4, 55:1–55:11), assignments whose probability falls below a threshold (10^-10 in the implementation) are pruned. To keep the search feasible, a continuous variable C_l may only take values that actually occur for parts in the training data. For each valid sample found by the search, the values of the variables C_l determine the component set used for synthesis.
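A minimal sketch of this pruned depth-first enumeration is shown below; prob stands in for the conditional probabilities of the learned model, and the data structures are assumptions.

```python
# Minimal sketch of the pruned depth-first enumeration: variables are assigned in a
# fixed order, partial assignments whose probability drops below the threshold are
# pruned, and complete assignments are emitted as candidate component sets.
def enumerate_assignments(variables, domains, prob, threshold=1e-10):
    """variables: ordered list of variable names, root R first.
       domains: {var: iterable of admissible values}.
       prob(assignment): probability of a (partial) assignment under the model."""
    results = []

    def dfs(idx, assignment, p):
        if p < threshold:
            return                                    # prune low-probability branches
        if idx == len(variables):
            results.append((dict(assignment), p))     # a candidate component set
            return
        var = variables[idx]
        for value in domains[var]:
            assignment[var] = value
            dfs(idx + 1, assignment, prob(assignment))
            del assignment[var]

    dfs(0, {}, 1.0)
    return results
```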
4. Export of the image object synthesis results: the image objects synthesized in the preceding steps are exported and stored in general formats and can be used by other digital media products and applications.
4.1 Export of the results
Steps 2 and 3 of this method can synthesize image data for large batches of new objects. For compatibility with common industry data formats, the synthesized images can be stored and exported in the corresponding file formats as the final results of this method. In addition, step 2 builds part-level three-dimensional proxies for the image objects; the three-dimensional proxies obtained by analyzing the training image objects can be exported, together with their part structures, segmentation regions, and texture images, in a clear and readable file format for use by techniques or applications that need them.
4.2 Application of the results
As a general image representation, the exported results of this method can be used in all existing image editing and design systems.
Implementation example
The inventors implemented all embodiments of the present invention on a machine equipped with an Intel Core 2 Quad Q9400 processor, 4 GB of memory, and the Windows 7 operating system. Using all of the parameter values listed in the detailed description, the inventors obtained all of the experimental results shown in the accompanying drawings.
Figures 12 and 13 compare image objects obtained by direct compositing with those synthesized by this method. Direct compositing tends to produce obvious distortions and unnatural results, whereas the viewpoint-aware synthesis method of the present invention generates visually more plausible and realistic novel image objects.
The inventors invited a number of users to evaluate the synthesis results (Fig. 6 to Fig. 10) generated by the Bayesian probabilistic graphical model synthesis algorithm of this method. The evaluation shows that, compared with the objects in the original image sets, users consider the synthesized image objects visually entirely plausible and genuinely novel. Overall, 48% of the selections judged a synthesized object to be visually plausible, showing no notable difference from the 52% obtained when the selections came from the training data. Regarding the novelty of the synthesis results for each category (see Figure 11), at most 90% of users considered the results of the robot set to be new objects, and at least 79% considered the results of the desk lamp set to be new objects.
Regarding the training time of the Bayesian probabilistic graphical model: the chair set (42 images, 6 component categories, 243 parts in total) requires about 20 minutes; the cup set (22 images, 3 component categories, 44 parts) about 5 minutes; the desk lamp set (30 images, 3 component categories, 90 parts) about 12 minutes; the robot set (23 images, 5 component categories, 130 parts) about 15 minutes; and the toy airplane set (15 images, 4 component categories, 63 parts) about 3 minutes. Regarding the synthesis time per image object: enumerating the component sets required for synthesis takes 20 seconds to 1 minute on average, connecting the parts of each image takes about 4 seconds on average, and color optimization takes about 1 second on average.

Claims (3)

1. A structured synthesis method for image objects, characterized in that it comprises the following steps:
(1) preprocessing of image object data: collecting an image set of objects of a particular category using digital or network equipment, requiring that the object structures are completely visible, and using image segmentation and labeling tools to obtain consistently segmented regions of the objects' constituent parts;
(2) viewpoint-aware image object synthesis: calibrating the camera parameters of a single image from simple user interaction, generating structured three-dimensional proxies by combining the camera parameters with the segmentation information of the image object, then connecting image components into a novel image object using the three-dimensional proxies and contact-point information, and finally obtaining the result image through intelligent color adjustment;
(3) training and synthesis with a Bayesian probabilistic graphical model: based on the consistently segmented image components, performing statistical learning on the image data set of the specific object category to obtain a probabilistic graphical model that can express the complex dependencies among shape style, object structure, component category, and camera parameters; and, by probabilistic inference on the learned Bayesian graphical model, sampling component categories, styles, and the like to obtain high-probability composition schemes and viewpoint attributes for image objects, and finally synthesizing the result images with the method of step 2;
(4) export of the image object synthesis results: exporting and storing, in general formats, the result images obtained in steps 2 and 3 together with the camera parameters and three-dimensional proxy data obtained in step 2.
2. The structured synthesis method for image objects according to claim 1, characterized in that said step 2 comprises the following sub-steps:
(2.1) assisted user interaction: on each input image, the coordinate origin and the axis directions of the world coordinate system are marked;
(2.2) the camera parameters of the input image are computed by optimization based on the user interaction information;
(2.3) based on the image object segmentation information obtained in step 1 and the camera parameters obtained in step 2.2, an initial three-dimensional proxy is computed for each part of the image object;
(2.4) based on the initial three-dimensional proxies of the object parts obtained in step 2.3, the overall structural constraints among the parts are used in an optimization that generates the final three-dimensional proxies;
(2.5) based on the input assembly scheme and viewpoint information, the three-dimensional proxies of the constituent parts are linked together through the three-dimensional contact points defined in their connection slots;
(2.6) based on the connected three-dimensional proxies, the image of each part under the current viewpoint is generated from the original part image through a two-dimensional affine transformation, where the affine transformation is computed under the constraints of the three-dimensional proxies;
(2.7) based on the two-dimensional contact-point information defined in the part connection slots, all part images are seamlessly connected in image space;
(2.8) based on the color combination information of the source images and an optimization using an existing color harmony evaluation model, the final dominant color of each part of the object is computed, and the colors of the parts are changed by a color transfer method to obtain the result image of the new object.
3. The structured synthesis method for image objects according to claim 2, characterized in that said step 3 comprises the following sub-steps:
(3.1) based on the consistent segmentation results obtained in step 1, the structural information of the image objects is established, including attributes such as the category, number, shape features, and interconnection relationships of the parts, where the shape features are described by the point-distribution-model coordinates of the two-dimensional silhouettes;
(3.2) based on the camera parameters obtained in claim 2, clustering is performed to obtain a discrete representation of the image viewpoint (the index of the viewpoint class);
(3.3) based on the object structure information obtained in step 3.1 and the viewpoint information obtained in step 3.2, a Bayesian probabilistic graphical model is trained on the image data set of the whole object category using the expectation-maximization algorithm and the Bayesian information criterion, so that this generative model characterizes the complex dependencies among the object structures, shape styles, component categories, and part connection relationships over the whole image data set;
(3.4) the probabilistic graphical model is sampled to obtain composition schemes and viewpoint information for new objects, that is, the source of each constituent part;
(3.5) based on the large number of assembly schemes and the viewpoint information obtained in step 3.4, all result images are generated using the method of claim 2.
CN201410163775.0A 2014-04-22 2014-04-22 Structural synthesis method of image object Active CN103942795B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410163775.0A CN103942795B (en) 2014-04-22 2014-04-22 Structural synthesis method of image object


Publications (2)

Publication Number Publication Date
CN103942795A 2014-07-23
CN103942795B CN103942795B (en) 2016-08-24

Family

ID=51190446

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410163775.0A Active CN103942795B (en) Structural synthesis method of image object

Country Status (1)

Country Link
CN (1) CN103942795B (en)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7729531B2 (en) * 2006-09-19 2010-06-01 Microsoft Corporation Identifying repeated-structure elements in images
US9058647B2 (en) * 2012-01-16 2015-06-16 Canon Kabushiki Kaisha Information processing apparatus, information processing method, and storage medium
CN102800129B (en) * 2012-06-20 2015-09-30 浙江大学 A kind of scalp electroacupuncture based on single image and portrait edit methods

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104376332A (en) * 2014-12-09 2015-02-25 深圳市捷顺科技实业股份有限公司 License plate recognition method and device
CN104850633A (en) * 2015-05-22 2015-08-19 中山大学 Three-dimensional model retrieval system and method based on parts division of hand-drawn draft
CN104850633B (en) * 2015-05-22 2018-10-12 中山大学 A kind of three-dimensional model searching system and method based on the segmentation of cartographical sketching component
GB2568993A (en) * 2017-11-29 2019-06-05 Adobe Inc Generating 3D structures using genetic programming to satisfy functional and geometric constraints
GB2568993B (en) * 2017-11-29 2021-05-19 Adobe Inc Generating 3D structures using genetic programming to satisfy functional and geometric constraints
CN111886609A (en) * 2018-03-13 2020-11-03 丰田研究所股份有限公司 System and method for reducing data storage in machine learning
CN111886609B (en) * 2018-03-13 2021-06-04 丰田研究所股份有限公司 System and method for reducing data storage in machine learning
CN111524589A (en) * 2020-04-14 2020-08-11 重庆大学 CDA (content-based discovery and analysis) shared document based health and medical big data quality control system and terminal
CN111524589B (en) * 2020-04-14 2021-04-30 重庆大学 CDA (content-based discovery and analysis) shared document based health and medical big data quality control system and terminal

Also Published As

Publication number Publication date
CN103942795B (en) 2016-08-24


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant