CN116704128B - Method and system for generating 3D model by single drawing based on deep learning - Google Patents

Method and system for generating 3D model by single drawing based on deep learning

Info

Publication number
CN116704128B
CN116704128B (application CN202310707220.7A)
Authority
CN
China
Prior art keywords
image
pixel
model
formula
gray
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310707220.7A
Other languages
Chinese (zh)
Other versions
CN116704128A (en)
Inventor
甘凌
顾大桐
王步国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yuanyue Technology Co., Ltd.
Original Assignee
Beijing Yuanyue Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2023-06-15
Filing date: 2023-06-15
Publication date: 2023-12-12
Application filed by Beijing Yuanyue Technology Co., Ltd.
Priority to CN202310707220.7A
Publication of CN116704128A
Application granted
Publication of CN116704128B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 17/00 Three-dimensional [3D] modelling, e.g. data description of 3D objects
    • G06T 15/00 3D [three-dimensional] image rendering
    • G06T 15/10 Geometric effects
    • G06T 15/20 Perspective computation
    • G06T 15/205 Image-based rendering
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/0455 Auto-encoder networks; encoder-decoder networks
    • G06N 3/0499 Feedforward networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Graphics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Geometry (AREA)
  • Image Generation (AREA)

Abstract

The invention relates to a method and system for generating a 3D model from a single drawing based on deep learning, comprising the following steps: preprocessing a drawing image and extracting features from its content; optimizing the extracted features and rendering the optimized features to generate a 2D image; acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and using them as training data; constructing a 3D generation model and training it on the training data; and inputting the generated 2D image into the trained 3D generation model and post-processing the 3D model it outputs. The method improves the efficiency of 3D-model generation, can generate a corresponding 3D model from any single drawing image, effectively expands the application scenarios of 3D models, and improves the user experience.

Description

Method and system for generating 3D model by single drawing based on deep learning
Technical Field
The invention relates to the technical field of drawing modeling, and in particular to a method and system for generating a 3D model from a single drawing based on deep learning.
Background
Deep learning is a machine learning method that trains multi-layer neural networks to automatically extract features from data and perform tasks such as classification and regression.
At present, drawing-generation algorithms have become a research hotspot in the field of artificial intelligence. Using deep learning, high-quality and diverse drawings can be generated by learning from and analyzing large numbers of artwork samples, which gives the technique broad application prospects.
In the related art, generating a 3D model generally requires manual modeling, which consumes a great deal of time and demands professional skills of the modeler, greatly reducing the efficiency of 3D-model generation.
Disclosure of Invention
To overcome these technical defects of the prior art, the invention provides a method and system for generating a 3D model from a single drawing based on deep learning, which can effectively solve the problems described in the background art.
To solve the above technical problems, the invention provides the following technical solution:
in a first aspect, an embodiment of the present invention discloses a method for generating a 3D model from a single drawing based on deep learning, including the steps of:
preprocessing a drawing image and extracting features from its content;
optimizing the extracted features, and rendering the optimized features to generate a 2D image;
acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and using them as training data;
constructing a 3D generation model, and training the 3D generation model on the training data;
inputting the generated 2D image into the trained 3D generation model, and post-processing the 3D model it outputs.
In any of the above schemes, preferably, preprocessing the drawing image and extracting features from its content includes the following steps:
scaling the drawing image and removing noise from it with a Gaussian kernel, G(x, y) = (1/(2πσ²))·exp(−(x² + y²)/(2σ²)), where (x, y) is the offset of a pixel from the kernel center and σ is the standard deviation of the Gaussian kernel;
performing histogram equalization on the drawing image by the formula s = T(r) = ((L − 1)/(M·N))·Σ_(q=0..r) h(q), where s is the output pixel value, r is the input pixel value, T(r) is the pixel transformation function, L is the number of gray levels, M and N are the height and width of the image, and h(q) is the number of pixels with input value q;
converting the drawing image into a gray image by the formula gray = 0.299×R + 0.587×G + 0.114×B, where R, G and B are the pixel values of the red, green and blue channels and gray is the resulting gray value;
calculating an optimal threshold k by the Otsu criterion, k = argmax_t { w₀(t)·w₁(t)·[m₀(t) − m₁(t)]² / [w₀(t)·σ₀²(t) + w₁(t)·σ₁²(t)] }, where w₀(t) and w₁(t) are the proportions of pixels below threshold t and at or above threshold t, m₀(t) and m₁(t) are the average gray values of those two pixel groups, and σ₀²(t) and σ₁²(t) are their gray variances;
binarizing the image with the optimal threshold k by the formula binary(i, j) = 1 if gray(i, j) ≥ k, otherwise binary(i, j) = 0, to obtain the contour of the target object in the drawing image, where gray(i, j) is the gray value of pixel (i, j) and binary(i, j) is its binarized value.
In any of the above schemes, preferably, optimizing the extracted features and rendering the optimized features to generate a 2D image includes the following steps:
merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions;
converting the merged region contours into curve descriptions to generate planar geometric shapes;
rendering the planar geometry to generate a 2D image, and annotating the 2D image with information.
In any of the above solutions, preferably, merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions, includes:
determining the adjacency between pixels using 8-connectivity: judging whether a pixel point is connected to a neighboring pixel, merging it into the connected region that neighbor belongs to if it is, and letting it form a connected region of its own if it is not;
numbering and marking each connected region, and calculating the similarity between adjacent regions by the formula Sim(i₁, j₁) = exp(−Σ_k₁ (F_i₁k₁ − F_j₁k₁)² / (2σF²)) · exp(−[(x_i₁ − x_j₁)² + (y_i₁ − y_j₁)²] / (2σP²)), where i₁ and j₁ are region numbers, Sim(i₁, j₁) is the similarity between regions i₁ and j₁, σF and σP are the similarity Gaussian radii in feature space and position space, F_i₁k₁ and F_j₁k₁ are the values of the feature vectors F_i₁ and F_j₁ in the k₁-th feature dimension, and (x_i₁, y_i₁) and (x_j₁, y_j₁) are the center coordinates of regions i₁ and j₁;
merging adjacent regions with a minimum-spanning-tree style algorithm until no further merging is possible, obtaining the merged regions.
In any of the above solutions, preferably, converting the merged region contours into curve descriptions and generating planar geometric shapes includes:
selecting a start point and an end point among the contour points, taking the line between them as a straight-line segment, and calculating the distance of every edge point to that segment;
splitting the segment into two straight-line segments at the point with the largest distance, and processing each of them in the same way;
setting a threshold and repeating this process until the distance of every point to its segment is smaller than the given threshold, obtaining the polygon-abstracted geometric shape.
In any of the foregoing solutions, preferably, rendering the planar geometry to generate a 2D image and annotating the 2D image with information includes:
converting the geometric shape into pixel coordinates with the DDA algorithm;
defining the row of each pixel point as a scan line, and processing the scan lines;
filling the pixel points on each scan line according to the width and color information of the line segments, using a Flood-Fill algorithm.
In any of the above aspects, preferably, processing a scan line includes:
finding all line segments that intersect the scan line;
performing interval operations on those segments to obtain the color values of the pixels on the corresponding scan line, where the interval operations include union and intersection.
In any of the above schemes, preferably, constructing the 3D generation model and training it with the training data includes the following steps:
building a fully connected neural network, comprising an encoder that maps the input image to a low-dimensional vector representation in a latent space, and a decoder that maps the low-dimensional vector back to object-surface or point-cloud data;
preprocessing the training data;
inputting the preprocessed training data into the fully connected neural network, taking as the training target the minimization of the error between the 3D-model prediction and the true SDF function, a target of the form L(θ) = (1/N)·Σ_(i=1..N) (1/c)·Σ_p [SDF_θ(p, I_i) − SDF_i(p)]² + λ·R(θ), where SDF_θ(p, I_i) is the SDF value output by the fully connected neural network, c is the size of the grid over whose points p the SDF is evaluated, θ are the network parameters, R(θ) is a regularization term, λ is the regularization-strength parameter, N is the number of training samples (I_i, SDF_i), I_i is the 2D image of a training sample, and SDF_i is the SDF function of the corresponding 3D-model surface.
In any of the above schemes, preferably, the encoder is a multi-layer perceptron whose g-th layer computes m_(g+1) = σ_g(W_g·m_g + b_g), where m_g is the input of layer g, W_g is the weight matrix, b_g is the bias, σ_g is the activation function, and m_(g+1) is the output.
Letting z be the input latent-space vector, the decoder is SDF(p, z) = d(p, F_z(p)), where p is a point in 3D space, F_z is the surface mapping the decoder derives from z, and d(p, F_z(p)) is the distance from point p to the nearest point on that surface.
In a second aspect, an embodiment of the present invention discloses a system for generating a 3D model from a single drawing based on deep learning, the system comprising:
an extraction module, used for preprocessing the drawing image and extracting features from its content;
a generation module, used for optimizing the extracted features and rendering the optimized features to generate a 2D image;
a storage module, used for acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and storing them as training data;
a construction module, used for constructing a 3D generation model and training it on the training data;
an output module, used for inputting the generated 2D image into the trained 3D generation model and post-processing the 3D model it outputs.
Compared with the prior art, the invention has the following beneficial effects:
the method for generating a 3D model from a single drawing based on deep learning avoids the manual modeling, time, and professional skills that traditional 3D-model generation requires; by extracting and optimizing features of the drawing image, it can quickly generate the corresponding 3D model;
feature extraction and optimization ensure that the generated 2D image is similar to the original drawing image, which in turn ensures the accuracy of the generated 3D model;
a corresponding 3D model can be generated from any single drawing image, effectively expanding the application scenarios of 3D models and improving the user experience.
Drawings
The accompanying drawings are included to provide a further understanding of the invention, and are incorporated in and constitute a part of this specification.
FIG. 1 is a flow chart of a method for generating a 3D model from a single drawing based on deep learning in accordance with the present invention;
FIG. 2 is a block diagram of a system for generating a 3D model from a single drawing based on deep learning in accordance with the present invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
It will be understood that when an element is referred to as being "mounted" or "disposed" on another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element.
In the description of the present invention, it should be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on the orientation or positional relationships shown in the drawings, merely to facilitate describing the present invention and simplify the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and therefore should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present invention, the meaning of "a plurality" is two or more, unless explicitly defined otherwise.
In order to better understand the above technical scheme, the following detailed description of the technical scheme of the present invention will be given with reference to the accompanying drawings of the specification and the specific embodiments.
As shown in fig. 1, the present invention provides a method for generating a 3D model from a single drawing based on deep learning, which includes the steps of:
step 1, preprocessing a drawing image and extracting features from its content;
step 2, optimizing the extracted features, and rendering the optimized features to generate a 2D image;
step 3, acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and using them as training data;
step 4, constructing a 3D generation model, and training the 3D generation model on the training data;
step 5, inputting the generated 2D image into the trained 3D generation model, and post-processing the 3D model it outputs.
Specifically, step 1, preprocessing the drawing image and extracting features from its content, includes the following steps:
step 11, scaling the drawing image and removing noise from it with a Gaussian kernel, G(x, y) = (1/(2πσ²))·exp(−(x² + y²)/(2σ²)), where (x, y) is the offset of a pixel from the kernel center and σ is the standard deviation of the Gaussian kernel;
step 12, performing histogram equalization on the drawing image by the formula s = T(r) = ((L − 1)/(M·N))·Σ_(q=0..r) h(q), where s is the output pixel value, r is the input pixel value, T(r) is the pixel transformation function, L is the number of gray levels, M and N are the height and width of the image, and h(q) is the number of pixels with input value q;
step 13, converting the drawing image into a gray image by the formula gray = 0.299×R + 0.587×G + 0.114×B, where R, G and B are the pixel values of the red, green and blue channels and gray is the resulting gray value;
step 14, calculating an optimal threshold k by the Otsu criterion, k = argmax_t { w₀(t)·w₁(t)·[m₀(t) − m₁(t)]² / [w₀(t)·σ₀²(t) + w₁(t)·σ₁²(t)] }, where w₀(t) and w₁(t) are the proportions of pixels below threshold t and at or above threshold t, m₀(t) and m₁(t) are the average gray values of those two pixel groups, and σ₀²(t) and σ₁²(t) are their gray variances;
step 15, binarizing the image with the optimal threshold k by the formula binary(i, j) = 1 if gray(i, j) ≥ k, otherwise binary(i, j) = 0, to obtain the contour of the target object in the drawing image, where gray(i, j) is the gray value of pixel (i, j) and binary(i, j) is its binarized value.
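For illustration, steps 11 to 15 map onto standard image operations; the following sketch uses OpenCV equivalents rather than the patent's own implementation (the file path, output size, and 5×5 kernel are assumptions, and equalization is applied after gray conversion because OpenCV's equalizeHist expects a single-channel image):

```python
# Hypothetical sketch of steps 11-15 using OpenCV equivalents; not the patent's code.
import cv2

def preprocess_drawing(path: str, size=(256, 256)):
    img = cv2.imread(path)                        # load the drawing image (BGR)
    img = cv2.resize(img, size)                   # step 11: scale
    img = cv2.GaussianBlur(img, (5, 5), 1.0)      # step 11: Gaussian denoising, sigma = 1.0
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # step 13: gray = 0.299R + 0.587G + 0.114B
    gray = cv2.equalizeHist(gray)                 # step 12: histogram equalization
    # steps 14-15: Otsu's optimal threshold k, then binarization (gray >= k -> 255)
    k, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return k, binary
```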
Specifically, step 2, optimizing the extracted features and rendering the optimized features to generate a 2D image, includes the following steps:
step 21, merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions;
step 22, converting the merged region contours into curve descriptions to generate planar geometric shapes;
step 23, rendering the planar geometry to generate a 2D image, and annotating the 2D image with information.
Further, step 21, merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions, includes:
step 211, determining the adjacency between pixels using 8-connectivity: judging whether a pixel point is connected to a neighboring pixel, merging it into the connected region that neighbor belongs to if it is, and letting it form a connected region of its own if it is not;
step 212, numbering and marking each connected region, and calculating the similarity between adjacent regions by the formula Sim(i₁, j₁) = exp(−Σ_k₁ (F_i₁k₁ − F_j₁k₁)² / (2σF²)) · exp(−[(x_i₁ − x_j₁)² + (y_i₁ − y_j₁)²] / (2σP²)), where i₁ and j₁ are region numbers, Sim(i₁, j₁) is the similarity between regions i₁ and j₁, σF and σP are the similarity Gaussian radii in feature space and position space, F_i₁k₁ and F_j₁k₁ are the values of the feature vectors F_i₁ and F_j₁ in the k₁-th feature dimension, and (x_i₁, y_i₁) and (x_j₁, y_j₁) are the center coordinates of regions i₁ and j₁;
step 213, merging adjacent regions with a minimum-spanning-tree style algorithm until no further merging is possible, obtaining the merged regions.
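As a concrete illustration of steps 211 to 213, a minimal sketch under stated assumptions (the one-dimensional area feature, the union-find merge standing in for a full minimum-spanning-tree pass, and the similarity threshold are all illustrative, not from the patent):

```python
# Hypothetical sketch of steps 211-213; feature choice and merge policy are illustrative.
import cv2
import numpy as np

def merge_connected_regions(binary, sigma_f=50.0, sigma_p=40.0, sim_thresh=0.5):
    # step 211: label the 8-connected regions of the binarized image
    n, labels = cv2.connectedComponents(binary, connectivity=8)
    feats, centers = [], []
    for r in range(1, n):                    # step 212: per-region feature and center
        ys, xs = np.nonzero(labels == r)
        feats.append(np.array([len(xs)], dtype=float))   # toy 1-D feature: region area
        centers.append(np.array([xs.mean(), ys.mean()]))
    parent = list(range(len(feats)))         # union-find as a stand-in for MST merging
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            # step 212: product of Gaussian kernels over feature and position distance
            sim = (np.exp(-np.sum((feats[i] - feats[j]) ** 2) / (2 * sigma_f ** 2))
                   * np.exp(-np.sum((centers[i] - centers[j]) ** 2) / (2 * sigma_p ** 2)))
            if sim > sim_thresh:             # step 213: merge sufficiently similar regions
                parent[find(i)] = find(j)
    return labels, [find(i) for i in range(len(feats))]
```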
Further, step 22, converting the merged region contours into curve descriptions and generating planar geometric shapes, includes:
step 221, selecting a start point and an end point among the contour points, taking the line between them as a straight-line segment, and calculating the distance of every edge point to that segment;
step 222, splitting the segment into two straight-line segments at the point with the largest distance, and processing each of them in the same way;
step 223, setting a threshold and repeating this process until the distance of every point to its segment is smaller than the given threshold, obtaining the polygon-abstracted geometric shape.
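Steps 221 to 223 describe the classic split-at-the-farthest-point polyline simplification (the Ramer-Douglas-Peucker scheme); a minimal recursive sketch, where eps is the threshold of step 223:

```python
# Recursive polyline simplification matching steps 221-223; eps is the distance threshold.
import numpy as np

def simplify_contour(points: np.ndarray, eps: float) -> np.ndarray:
    """points: (n, 2) float array of contour points, from start point to end point."""
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    dx, dy = end - start
    # step 221: perpendicular distance of every point to the start-end segment
    dist = np.abs(dx * (points[:, 1] - start[1]) - dy * (points[:, 0] - start[0]))
    dist = dist / (np.hypot(dx, dy) + 1e-12)
    i = int(np.argmax(dist))
    if dist[i] < eps:                      # step 223: the segment fits well enough
        return np.array([start, end])
    # step 222: split at the farthest point and simplify both halves the same way
    left = simplify_contour(points[: i + 1], eps)
    right = simplify_contour(points[i:], eps)
    return np.vstack([left[:-1], right])   # drop the duplicated split point
```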
Further, step 23, rendering the planar geometry to generate a 2D image and annotating the 2D image with information, includes:
step 231, converting the geometric shape into pixel coordinates with the DDA algorithm;
step 232, defining the row of each pixel point as a scan line, and processing the scan lines;
step 233, filling the pixel points on each scan line according to the width and color information of the line segments, using a Flood-Fill algorithm.
Still further, in step 232, processing a scan line includes:
finding all line segments that intersect the scan line;
performing interval operations on those segments to obtain the color values of the pixels on the corresponding scan line, where the interval operations include union and intersection.
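A minimal sketch of steps 231 and 232 under assumptions (integer rounding for the DDA step, and a plain interval-union fill standing in for the Flood-Fill pass of step 233):

```python
# Hypothetical rasterization sketch: DDA line stepping plus per-scan-line interval filling.
import numpy as np

def dda_line(x0, y0, x1, y1):
    """Step 231: pixel coordinates the segment (x0, y0)-(x1, y1) passes through."""
    steps = int(max(abs(x1 - x0), abs(y1 - y0)))   # one step per pixel on the major axis
    if steps == 0:
        return [(round(x0), round(y0))]
    dx, dy = (x1 - x0) / steps, (y1 - y0) / steps  # per-step increments
    return [(round(x0 + k * dx), round(y0 + k * dy)) for k in range(steps + 1)]

def fill_scan_line(canvas: np.ndarray, y: int, intervals, color):
    """Step 232: color the union of [x_start, x_end] intervals crossing scan line y."""
    for xs, xe in intervals:
        canvas[y, xs:xe + 1] = color
```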
Specifically, step 4, constructing the 3D generation model and training it with the training data, includes the following steps:
step 41, building a fully connected neural network, comprising an encoder that maps the input image to a low-dimensional vector representation in a latent space, and a decoder that maps the low-dimensional vector back to object-surface or point-cloud data;
step 42, preprocessing the training data;
step 43, inputting the preprocessed training data into the fully connected neural network, taking as the training target the minimization of the error between the 3D-model prediction and the true SDF function, a target of the form L(θ) = (1/N)·Σ_(i=1..N) (1/c)·Σ_p [SDF_θ(p, I_i) − SDF_i(p)]² + λ·R(θ), where SDF_θ(p, I_i) is the SDF value output by the fully connected neural network, c is the size of the grid over whose points p the SDF is evaluated, θ are the network parameters, R(θ) is a regularization term, λ is the regularization-strength parameter, N is the number of training samples (I_i, SDF_i), I_i is the 2D image of a training sample, and SDF_i is the SDF function of the corresponding 3D-model surface.
Further, the encoder is a multi-layer perceptron whose g-th layer computes m_(g+1) = σ_g(W_g·m_g + b_g), where m_g is the input of layer g, W_g is the weight matrix, b_g is the bias, σ_g is the activation function, and m_(g+1) is the output.
Letting z be the input latent-space vector, the decoder is SDF(p, z) = d(p, F_z(p)), where p is a point in 3D space, F_z is the surface mapping the decoder derives from z, and d(p, F_z(p)) is the distance from point p to the nearest point on that surface.
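A hedged PyTorch sketch of the step-41 architecture and the step-43 objective (the layer widths, the latent dimension, and the L2 form of R(θ) are assumptions; the patent only fixes the fully connected encoder-decoder structure and the SDF regression target):

```python
# Illustrative fully connected encoder-decoder for SDF regression; sizes are assumptions.
import torch
import torch.nn as nn

class SDFNet(nn.Module):
    def __init__(self, img_dim=256 * 256, latent_dim=128):
        super().__init__()
        # encoder: flattened 2D image -> low-dimensional latent vector z
        self.encoder = nn.Sequential(
            nn.Linear(img_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim),
        )
        # decoder: 3D query point p concatenated with z -> signed distance;
        # each layer computes m_(g+1) = sigma_g(W_g m_g + b_g)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + 3, 512), nn.ReLU(),
            nn.Linear(512, 1),
        )

    def forward(self, images, points):
        # images: (B, H*W) or (B, H, W); points: (B, P, 3) grid sample points
        z = self.encoder(images.flatten(1))                 # (B, latent_dim)
        z = z.unsqueeze(1).expand(-1, points.shape[1], -1)  # broadcast z to every point
        return self.decoder(torch.cat([points, z], dim=-1)).squeeze(-1)

def sdf_loss(pred, target, model, lam=1e-4):
    """Step-43 objective: mean squared SDF error plus lambda * R(theta), here an L2 term."""
    reg = sum((w ** 2).sum() for w in model.parameters())
    return nn.functional.mse_loss(pred, target) + lam * reg
```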
As shown in fig. 2, the present invention further provides a system for generating a 3D model from a single drawing based on deep learning, the system comprising:
an extraction module, used for preprocessing the drawing image and extracting features from its content;
a generation module, used for optimizing the extracted features and rendering the optimized features to generate a 2D image;
a storage module, used for acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and storing them as training data;
a construction module, used for constructing a 3D generation model and training it on the training data;
an output module, used for inputting the generated 2D image into the trained 3D generation model and post-processing the 3D model it outputs.
Compared with the prior art, the invention has the following beneficial effects:
the method for generating a 3D model from a single drawing based on deep learning avoids the manual modeling, time, and professional skills that traditional 3D-model generation requires; by extracting and optimizing features of the drawing image, it can quickly generate the corresponding 3D model;
feature extraction and optimization ensure that the generated 2D image is similar to the original drawing image, which in turn ensures the accuracy of the generated 3D model;
a corresponding 3D model can be generated from any single drawing image, effectively expanding the application scenarios of 3D models and improving the user experience.
The above is only a preferred embodiment of the present invention and does not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described there or substitute equivalents for some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (8)

1. A method for generating a 3D model from a single drawing based on deep learning, characterized in that the method comprises the following steps:
preprocessing a drawing image and extracting features from its content, comprising the following steps:
scaling the drawing image and removing noise from it with a Gaussian kernel, G(x, y) = (1/(2πσ²))·exp(−(x² + y²)/(2σ²)), where (x, y) is the offset of a pixel from the kernel center and σ is the standard deviation of the Gaussian kernel;
performing histogram equalization on the drawing image by the formula s = T(r) = ((L − 1)/(M·N))·Σ_(q=0..r) h(q), where s is the output pixel value, r is the input pixel value, T(r) is the pixel transformation function, L is the number of gray levels, M and N are the height and width of the image, and h(q) is the number of pixels with input value q;
converting the drawing image into a gray image by the formula gray = 0.299×R + 0.587×G + 0.114×B, where R, G and B are the pixel values of the red, green and blue channels and gray is the resulting gray value;
calculating an optimal threshold k by the Otsu criterion, k = argmax_t { w₀(t)·w₁(t)·[m₀(t) − m₁(t)]² / [w₀(t)·σ₀²(t) + w₁(t)·σ₁²(t)] }, where w₀(t) and w₁(t) are the proportions of pixels below threshold t and at or above threshold t, m₀(t) and m₁(t) are the average gray values of those two pixel groups, and σ₀²(t) and σ₁²(t) are their gray variances;
binarizing the image with the optimal threshold k by the formula binary(i, j) = 1 if gray(i, j) ≥ k, otherwise binary(i, j) = 0, to obtain the contour of the target object in the drawing image, where gray(i, j) is the gray value of pixel (i, j) and binary(i, j) is its binarized value;
optimizing the extracted features and rendering the optimized features to generate a 2D image, comprising the following steps:
merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions;
converting the merged region contours into curve descriptions to generate planar geometric shapes;
rendering the planar geometry to generate a 2D image, and annotating the 2D image with information;
acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and using them as training data;
constructing a 3D generation model, and training the 3D generation model on the training data;
inputting the generated 2D image into the trained 3D generation model, and post-processing the 3D model it outputs.
2. The method for generating a 3D model from a single drawing based on deep learning of claim 1, characterized in that merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions, comprises:
determining the adjacency between pixels using 8-connectivity: judging whether a pixel point is connected to a neighboring pixel, merging it into the connected region that neighbor belongs to if it is, and letting it form a connected region of its own if it is not;
numbering and marking each connected region, and calculating the similarity between adjacent regions by the formula Sim(i₁, j₁) = exp(−Σ_k₁ (F_i₁k₁ − F_j₁k₁)² / (2σF²)) · exp(−[(x_i₁ − x_j₁)² + (y_i₁ − y_j₁)²] / (2σP²)), where i₁ and j₁ are region numbers, Sim(i₁, j₁) is the similarity between regions i₁ and j₁, σF and σP are the similarity Gaussian radii in feature space and position space, F_i₁k₁ and F_j₁k₁ are the values of the feature vectors F_i₁ and F_j₁ in the k₁-th feature dimension, and (x_i₁, y_i₁) and (x_j₁, y_j₁) are the center coordinates of regions i₁ and j₁;
merging adjacent regions with a minimum-spanning-tree algorithm until no further merging is possible, obtaining the merged regions.
3. The method for generating a 3D model from a single drawing based on deep learning of claim 2, characterized in that converting the merged region contours into curve descriptions and generating planar geometric shapes comprises:
selecting a start point and an end point among the contour points, taking the line between them as a straight-line segment, and calculating the distance of every edge point to that segment;
splitting the segment into two straight-line segments at the point with the largest distance, and processing each of them in the same way;
setting a threshold and repeating this process until the distance of every point to its segment is smaller than the given threshold, obtaining the polygon-abstracted geometric shape.
4. The method for generating a 3D model from a single drawing based on deep learning of claim 3, characterized in that rendering the planar geometry to generate a 2D image and annotating the 2D image with information comprises:
converting the geometric shape into pixel coordinates with the DDA algorithm;
defining the row of each pixel point as a scan line, and processing the scan lines;
filling the pixel points on each scan line according to the width and color information of the line segments, using a Flood-Fill algorithm.
5. The method for generating a 3D model from a single drawing based on deep learning of claim 4, characterized in that processing a scan line comprises:
finding all line segments that intersect the scan line;
performing interval operations on those segments to obtain the color values of the pixels on the corresponding scan line, where the interval operations include union and intersection.
6. The method for generating a 3D model from a single drawing based on deep learning of claim 5, characterized in that constructing the 3D generation model and training it with the training data comprises the following steps:
building a fully connected neural network, comprising an encoder that maps the input image to a low-dimensional vector representation in a latent space, and a decoder that maps the low-dimensional vector back to object-surface or point-cloud data;
preprocessing the training data;
inputting the preprocessed training data into the fully connected neural network, taking as the training target the minimization of the error between the 3D-model prediction and the true SDF function, a target of the form L(θ) = (1/N)·Σ_(i=1..N) (1/c)·Σ_p [SDF_θ(p, I_i) − SDF_i(p)]² + λ·R(θ), where SDF_θ(p, I_i) is the SDF value output by the fully connected neural network, c is the size of the grid over whose points p the SDF is evaluated, θ are the network parameters, R(θ) is a regularization term, λ is the regularization-strength parameter, N is the number of training samples (I_i, SDF_i), I_i is the 2D image of a training sample, and SDF_i is the SDF function of the corresponding 3D-model surface.
7. The method for generating a 3D model from a single drawing based on deep learning of claim 6, characterized in that the encoder is a multi-layer perceptron whose g-th layer computes m_(g+1) = σ_g(W_g·m_g + b_g), where m_g is the input of layer g, W_g is the weight matrix, b_g is the bias, σ_g is the activation function, and m_(g+1) is the output;
letting z be the input latent-space vector, the decoder is SDF(p, z) = d(p, F_z(p)), where p is a point in 3D space, F_z is the surface mapping the decoder derives from z, and d(p, F_z(p)) is the distance from point p to the nearest point on that surface.
8. A system for generating a 3D model from a single drawing based on deep learning, characterized in that the system comprises:
an extraction module, used for preprocessing the drawing image and extracting features from its content, comprising the following steps:
scaling the drawing image and removing noise from it with a Gaussian kernel, G(x, y) = (1/(2πσ²))·exp(−(x² + y²)/(2σ²)), where (x, y) is the offset of a pixel from the kernel center and σ is the standard deviation of the Gaussian kernel;
performing histogram equalization on the drawing image by the formula s = T(r) = ((L − 1)/(M·N))·Σ_(q=0..r) h(q), where s is the output pixel value, r is the input pixel value, T(r) is the pixel transformation function, L is the number of gray levels, M and N are the height and width of the image, and h(q) is the number of pixels with input value q;
converting the drawing image into a gray image by the formula gray = 0.299×R + 0.587×G + 0.114×B, where R, G and B are the pixel values of the red, green and blue channels and gray is the resulting gray value;
calculating an optimal threshold k by the Otsu criterion, k = argmax_t { w₀(t)·w₁(t)·[m₀(t) − m₁(t)]² / [w₀(t)·σ₀²(t) + w₁(t)·σ₁²(t)] }, where w₀(t) and w₁(t) are the proportions of pixels below threshold t and at or above threshold t, m₀(t) and m₁(t) are the average gray values of those two pixel groups, and σ₀²(t) and σ₁²(t) are their gray variances;
binarizing the image with the optimal threshold k by the formula binary(i, j) = 1 if gray(i, j) ≥ k, otherwise binary(i, j) = 0, to obtain the contour of the target object in the drawing image, where gray(i, j) is the gray value of pixel (i, j) and binary(i, j) is its binarized value;
a generation module, used for optimizing the extracted features and rendering the optimized features to generate a 2D image, comprising the following steps:
merging discrete pixel points into connected regions according to the connectivity of adjacent pixels in the binarized image, and merging adjacent connected regions;
converting the merged region contours into curve descriptions to generate planar geometric shapes;
rendering the planar geometry to generate a 2D image, and annotating the 2D image with information;
a storage module, used for acquiring 2D images of the same type as the generated 2D image, together with their corresponding 3D models, and storing them as training data;
a construction module, used for constructing a 3D generation model and training it on the training data;
an output module, used for inputting the generated 2D image into the trained 3D generation model and post-processing the 3D model it outputs.
CN202310707220.7A 2023-06-15 2023-06-15 Method and system for generating 3D model by single drawing based on deep learning Active CN116704128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310707220.7A CN116704128B (en) 2023-06-15 2023-06-15 Method and system for generating 3D model by single drawing based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310707220.7A CN116704128B (en) 2023-06-15 2023-06-15 Method and system for generating 3D model by single drawing based on deep learning

Publications (2)

Publication Number Publication Date
CN116704128A CN116704128A (en) 2023-09-05
CN116704128B (granted publication) 2023-12-12

Family

ID=87842945

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310707220.7A Active CN116704128B (en) 2023-06-15 2023-06-15 Method and system for generating 3D model by single drawing based on deep learning

Country Status (1)

Country Link
CN (1) CN116704128B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9341848B2 (en) * 2013-10-17 2016-05-17 Cherif Algreatly Method of 3D modeling

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110322529A (en) * 2019-07-12 2019-10-11 电子科技大学 A method of deep-learning-aided art painting
CN111508069A (en) * 2020-05-22 2020-08-07 南京大学 Three-dimensional face reconstruction method based on single hand-drawn sketch
CN112330795A (en) * 2020-10-10 2021-02-05 清华大学 Human body three-dimensional reconstruction method and system based on single RGBD image
CN113129447A (en) * 2021-04-12 2021-07-16 清华大学 Three-dimensional model generation method and device based on single hand-drawn sketch and electronic equipment
CN115457197A (en) * 2022-08-29 2022-12-09 北京邮电大学 Face three-dimensional reconstruction model training method, reconstruction method and device based on sketch

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
单幅手绘草图重建三维模型方法研究 [Research on methods of reconstructing 3D models from a single hand-drawn sketch]; 高宇军; 高满屯; 王淑侠; 科学技术与工程 (Science Technology and Engineering); Vol. 12, No. 002; pp. 334-337 *

Also Published As

Publication number Publication date
CN116704128A (en) 2023-09-05

Similar Documents

Publication Publication Date Title
CN108280397B (en) Human body image hair detection method based on deep convolutional neural network
CN113160192B (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
Lu et al. Fast 3D line segment detection from unorganized point cloud
CN109410168B (en) Modeling method of convolutional neural network for determining sub-tile classes in an image
CN108537239B (en) Method for detecting image saliency target
CN109446895B (en) Pedestrian identification method based on human head features
CN105335725A (en) Gait identification identity authentication method based on feature fusion
CN103955913B (en) It is a kind of based on line segment co-occurrence matrix feature and the SAR image segmentation method of administrative division map
CN112613097A (en) BIM rapid modeling method based on computer vision
CN110827304B (en) Traditional Chinese medicine tongue image positioning method and system based on deep convolution network and level set method
CN112784736A (en) Multi-mode feature fusion character interaction behavior recognition method
CN110287798B (en) Vector network pedestrian detection method based on feature modularization and context fusion
CN111640116B (en) Aerial photography graph building segmentation method and device based on deep convolutional residual error network
CN112837344A (en) Target tracking method for generating twin network based on conditional confrontation
CN110991258B (en) Face fusion feature extraction method and system
CN112381830B (en) Method and device for extracting bird key parts based on YCbCr superpixels and graph cut
CN108596195A (en) A kind of scene recognition method based on sparse coding feature extraction
CN115797813B (en) Water environment pollution detection method based on aerial image
CN110766016A (en) Code spraying character recognition method based on probabilistic neural network
CN110853064A (en) Image collaborative segmentation method based on minimum fuzzy divergence
CN113592894A (en) Image segmentation method based on bounding box and co-occurrence feature prediction
CN115690513A (en) Urban street tree species identification method based on deep learning
Chen et al. Page segmentation for historical handwritten document images using conditional random fields
CN112396655A (en) Point cloud data-based ship target 6D pose estimation method
CN108022245A (en) Photovoltaic panel template automatic generation method based on upper thread primitive correlation model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant