CN116109743B - Digital person generation method and system based on AI and image synthesis technology - Google Patents


Info

Publication number
CN116109743B
Authority
CN
China
Prior art keywords
image, noise reduction, pixel points, digital person, face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310375969.6A
Other languages
Chinese (zh)
Other versions
CN116109743A (en)
Inventor
郑飞
胡伟
谢宇笙
徐龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Intelligent Computing Information Technology Co ltd
Original Assignee
Guangzhou Intelligent Computing Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Intelligent Computing Information Technology Co ltd filed Critical Guangzhou Intelligent Computing Information Technology Co ltd
Priority to CN202310375969.6A
Publication of CN116109743A
Application granted
Publication of CN116109743B
Legal status: Active

Classifications

    • G06T 13/40 - 3D animation of characters, e.g. humans, animals or virtual beings
    • G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 5/70 - Denoising; smoothing
    • G06T 7/13 - Edge detection
    • G06T 2207/20024 - Filtering details
    • G06T 2207/20192 - Edge enhancement; edge preservation
    • G06T 2207/20216 - Image averaging
    • G06T 2207/20221 - Image fusion; image merging
    • G06T 2207/30201 - Face
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science
  • Physics & Mathematics
  • General Physics & Mathematics
  • Theoretical Computer Science
  • Computer Vision & Pattern Recognition
  • Image Processing

Abstract

The invention belongs to the field of digital persons, and discloses a digital person generation method and system based on AI and image synthesis technology. The method comprises: S1, acquiring a real face image P; S2, preprocessing P to obtain a preprocessed image aP; S3, performing noise reduction on aP: edge pixel points are obtained through edge detection, a first image oneP is formed from the edge pixel points and their adjacent pixel points, a second image twoP is formed from the remaining pixel points, and the noise reduction results of oneP and twoP are merged to obtain a noise-reduced image bP; S4, performing three-dimensional reconstruction on bP to obtain a three-dimensional face model; S5, generating a digital person based on the three-dimensional face model. The invention makes the face of the generated digital person more similar to the face in the real face image.

Description

Digital person generation method and system based on AI and image synthesis technology
Technical Field
The invention relates to the field of digital persons, in particular to a digital person generation method and system based on AI and image synthesis technology.
Background
The digital person refers to a digital character image created by digital technology that is close to the human image. The digital person is a product of the fusion of information science and life science, and is the result of virtual simulation of the forms and functions of the human body at different levels using information science methods. In order to make the facial expression of the digital person more similar to that of a real person, the prior art generally performs three-dimensional modeling based on a real face image to obtain a three-dimensional face, and then applies the bone information and map information of the three-dimensional face to the face of the digital person, thereby obtaining a digital person whose face is similar to the real face.
In the prior art, before three-dimensional modeling is performed on a face image, noise reduction processing must be performed on the face image to reduce the influence of noise points on the accuracy of the three-dimensional modeling, thereby improving the similarity between the face of the generated digital person and the face in the real face image. However, existing noise reduction methods do not distinguish edge pixel points during noise reduction. Because edge pixel points and noise pixel points have similar characteristics, edge pixel points are easily treated as noise pixel points, so that the edge of the face becomes blurred, which affects the accuracy of the three-dimensional reconstruction and, in turn, the similarity between the face of the generated digital person and the face in the real face image.
Disclosure of Invention
The invention aims to disclose a digital person generation method and system based on AI and image synthesis technology, solving the problem that, in the existing digital person generation process, noise reduction before three-dimensional reconstruction does not distinguish edge pixel points, which degrades the similarity between the face of the generated digital person and the face in the real face image.
In order to achieve the above purpose, the invention adopts the following technical scheme:
In one aspect, the present invention provides a digital person generation method based on AI and image synthesis technology, comprising:
S1, acquiring a real face image P;
S2, preprocessing P to obtain a preprocessed image aP;
S3, performing noise reduction processing on aP by the following method to obtain a noise-reduced image bP:
S31, performing edge detection on aP to obtain a set of edge pixel points oneSet;
S32, respectively acquiring the pixel points in the noise reduction window corresponding to each edge pixel point;
S33, storing all pixel points in the noise reduction windows into a set twoSet;
S34, storing the pixel points of aP that do not belong to the set twoSet into a set thrSet;
S35, forming a first image oneP from the pixel points in the set twoSet, and a second image twoP from the pixel points in the set thrSet;
S36, performing noise reduction on the pixel points in oneP and twoP respectively to obtain a noise-reduced first image afOneP and a noise-reduced second image afTwoP;
S37, merging afOneP and afTwoP to obtain the noise-reduced image bP;
S4, performing three-dimensional reconstruction on bP to obtain a three-dimensional face model;
S5, generating a digital person based on the three-dimensional face model.
On the other hand, the invention also provides a digital person generation system based on AI and image synthesis technology, comprising an acquisition module, a preprocessing module, a noise reduction module, a reconstruction module and a generation module;
the acquisition module is used for acquiring a real face image P;
the preprocessing module is used for preprocessing P to obtain a preprocessed image aP;
the noise reduction module is used for performing noise reduction processing on aP in the following manner to obtain a noise-reduced image bP:
performing edge detection on aP to obtain a set of edge pixel points oneSet;
respectively acquiring the pixel points in the noise reduction window corresponding to each edge pixel point;
storing all pixel points in the noise reduction windows into a set twoSet;
storing the pixel points of aP that do not belong to the set twoSet into a set thrSet;
forming a first image oneP from the pixel points in the set twoSet, and a second image twoP from the pixel points in the set thrSet;
performing noise reduction on the pixel points in oneP and twoP respectively to obtain a noise-reduced first image afOneP and a noise-reduced second image afTwoP;
merging afOneP and afTwoP to obtain the noise-reduced image bP;
the reconstruction module is used for performing three-dimensional reconstruction on bP to obtain a three-dimensional face model;
the generation module is used for generating the digital person based on the three-dimensional face model.
Compared with the existing noise reduction schemes for the real face image in the digital person generation process, the invention first performs edge detection during noise reduction, then obtains the pixel points of the noise reduction window corresponding to each edge pixel point, and forms a first image from the pixel points of all the noise reduction windows, while the pixel points of aP outside the first image form a second image; the first and second images are denoised separately, and the final noise-reduced image bP is obtained from the two results. Because the edge pixel points are distinguished from the common pixel points and denoised in different ways, the final noise reduction result effectively retains the edge information of the real face image and improves the accuracy of the three-dimensional reconstruction, so that the face of the generated digital person is more similar to the face in the real face image.
Drawings
The invention is further described below with reference to the accompanying drawings. The embodiments do not constitute any limitation of the invention, and one of ordinary skill in the art can obtain other drawings from the following drawings without inventive effort.
Fig. 1 is a diagram of an embodiment of a digital person generation method based on AI and image synthesis technology according to the present invention.
FIG. 2 is a diagram of an embodiment of preprocessing P to obtain the preprocessed image aP according to the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
In one aspect, as shown in the embodiment of fig. 1, the present invention provides a digital person generation method based on AI and image synthesis technology, comprising:
S1, acquiring a real face image P;
S2, preprocessing P to obtain a preprocessed image aP;
S3, performing noise reduction processing on aP by the following method to obtain a noise-reduced image bP:
S31, performing edge detection on aP to obtain a set of edge pixel points oneSet;
S32, respectively acquiring the pixel points in the noise reduction window corresponding to each edge pixel point;
S33, storing all pixel points in the noise reduction windows into a set twoSet;
S34, storing the pixel points of aP that do not belong to the set twoSet into a set thrSet;
S35, forming a first image oneP from the pixel points in the set twoSet, and a second image twoP from the pixel points in the set thrSet;
S36, performing noise reduction on the pixel points in oneP and twoP respectively to obtain a noise-reduced first image afOneP and a noise-reduced second image afTwoP;
S37, merging afOneP and afTwoP to obtain the noise-reduced image bP;
S4, performing three-dimensional reconstruction on bP to obtain a three-dimensional face model;
S5, generating a digital person based on the three-dimensional face model.
Compared with the existing noise reduction schemes for the real face image in the digital person generation process, the invention first performs edge detection during noise reduction, then obtains the pixel points of the noise reduction window corresponding to each edge pixel point, and forms a first image from the pixel points of all the noise reduction windows, while the pixel points of aP outside the first image form a second image; the first and second images are denoised separately, and the final noise-reduced image bP is obtained from the two results. Because the edge pixel points are distinguished from the common pixel points and denoised in different ways, the final noise reduction result effectively retains the edge information of the real face image and improves the accuracy of the three-dimensional reconstruction, so that the face of the generated digital person is more similar to the face in the real face image.
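The split-and-merge noise reduction of S3 can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: the Laplacian threshold value and the two stand-in denoisers (identity for the edge image, a box mean filter for the remainder) are assumptions used in place of the improved wavelet and hybrid algorithms the description specifies for S36.

```python
import numpy as np

def laplacian_edges(img, thresh=30.0):
    """S31: detect edge pixels with the 4-neighbour Laplacian operator."""
    p = np.pad(img.astype(float), 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
           - 4.0 * p[1:-1, 1:-1])
    return np.abs(lap) > thresh

def window_union(mask, H=3):
    """S32-S33: mark every pixel inside the H x H noise reduction window
    centred on an edge pixel (the set twoSet, as a boolean mask)."""
    r = H // 2
    p = np.pad(mask, r, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    rows, cols = mask.shape
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out |= p[r + dy : r + dy + rows, r + dx : r + dx + cols]
    return out

def box_filter(img, k=3):
    """Simple k x k mean filter, used here only as a stand-in denoiser."""
    r = k // 2
    p = np.pad(img.astype(float), r, mode="edge")
    acc = np.zeros(img.shape, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            acc += p[r + dy : r + dy + img.shape[0],
                     r + dx : r + dx + img.shape[1]]
    return acc / (k * k)

def denoise_split_merge(aP, H=3):
    """S3: split aP into the edge image oneP and the remainder twoP,
    denoise each differently, then merge into bP (S34-S37)."""
    twoSet = window_union(laplacian_edges(aP), H)  # pixels forming oneP
    thrSet = ~twoSet                               # pixels forming twoP
    afOneP = aP.astype(float)   # stand-in: preserve edge detail untouched
    afTwoP = box_filter(aP)     # stand-in: smooth the non-edge regions
    bP = np.where(twoSet, afOneP, afTwoP)          # S37: merge the results
    return bP, twoSet, thrSet
```

Replacing the two stand-ins with the improved wavelet algorithm and the hybrid algorithm described later would complete the patent's S36.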
Preferably, as shown in fig. 2, S2 comprises:
S21, dividing P into a plurality of sub-blocks of the same size;
S22, performing brightness adjustment processing on each sub-block respectively to obtain brightness-adjusted sub-blocks;
S23, forming the preprocessed image aP from all the brightness-adjusted sub-blocks.
Specifically, the brightness adjustment processing balances the brightness distribution in the real face image, avoiding the impact of uneven brightness on the accuracy of the three-dimensional reconstruction.
Preferably, for a sub-block sonBk, the following method is used to perform brightness adjustment processing on sonBk:
acquiring the image imgLig corresponding to the luminance component of sonBk in the Lab color space;
acquiring the gray-scale image imgGra corresponding to sonBk;
acquiring the illumination image imgLi corresponding to imgGra based on the retinex model;
performing image subtraction of imgLig and imgLi to obtain the reflection image imgDc corresponding to imgLig;
converting imgGra into a mean image imgAve;
performing image addition of imgAve and imgDc to obtain the image imgRt;
converting imgRt back to the RGB color space to obtain the brightness-adjusted sub-block.
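The brightness adjustment steps above can be sketched as follows, assuming the sub-block is already available as its Lab luminance channel imgLig and its gray-scale version imgGra (the color-space conversions are omitted), and using a uniform image as a stand-in for the mean image imgAve. The function names, sigma, and eps are illustrative assumptions.

```python
import numpy as np

def gaussian_blur(img, sigma=2.0):
    """Estimate the illumination image imgLi from imgGra by Gaussian
    filtering (the single-scale retinex assumption)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    p = np.pad(img, r, mode="edge")
    # separable convolution: filter rows, then columns
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, p)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, tmp)

def brightness_adjust(imgLig, imgGra, eps=1.0):
    """Sketch of the sub-block brightness adjustment: work in the log
    domain, subtract the illumination from the luminance to get the
    reflection image imgDc, then add back a flattened mean image.
    eps avoids log(0); a uniform mean image stands in for imgAve."""
    imgLi = gaussian_blur(imgGra)
    logDc = np.log(imgLig + eps) - np.log(imgLi + eps)   # reflection image
    imgAve = np.full_like(imgGra, imgGra.mean())         # stand-in imgAve
    logRt = np.log(imgAve + eps) + logDc
    return np.exp(logRt) - eps                           # imgRt
```

On a slowly varying sub-block the illumination estimate tracks the luminance, so the reflection term is small and the output is pulled toward the flattened mean image, which is the equalizing effect the description aims for.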
In conventional retinex-based brightness adjustment algorithms, brightness adjustment is generally performed within a single color space, for example only in the RGB color space. Such processing easily leads to an insufficient number of distinct pixel levels in the resulting image, that is, to a degraded dynamic range. The invention therefore computes in the Lab color space and the gray space respectively, and in particular introduces the mean image and limits its number of intervals, so that while the illumination distribution is balanced, the reduction in the number of pixel value levels is effectively slowed and the dynamic range of the equalized image is preserved as much as possible.
Preferably, acquiring the illumination image imgLi corresponding to imgGra based on the retinex model comprises:
performing Gaussian filtering on imgGra to obtain the illumination image imgLi.
Preferably, performing image subtraction of imgLig and imgLi to obtain the reflection image imgDc corresponding to imgLig comprises:
performing the subtraction using the following formula:
log imgDc = log imgLig − log imgLi.
Preferably, performing image addition of imgAve and imgDc to obtain the image imgRt comprises:
performing the addition using the following formula:
log imgRt = log imgAve + log imgDc.
preferably, the said willimgGraConversion to mean imageimgAveComprising:
will beimgGraThe maximum value and the minimum value of the middle gray value are respectively recorded asmaGraAndmiGra
the gray value interval is calculated as follows:
Figure SMS_1
wherein, the liquid crystal display device comprises a liquid crystal display device,flGrathe gray value interval is represented and,Qrepresenting the number of intervals;
pixel value is set at
Figure SMS_2
The pixel values of the pixel points within the range are uniformly set to +.>
Figure SMS_3
,/>
Figure SMS_4
I is an integer, ">
Figure SMS_5
Representing pixel values at
Figure SMS_6
An average value of pixel values of pixel points within the range.
By setting the number of Q, a limitation on the pixel level in the image is achieved, so that the dynamic range of the equalized image is maintained.
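The interval averaging above can be sketched as follows. The width flGra = (maGra − miGra) / Q and the handling of a constant sub-block are assumptions consistent with the text, and the function name is illustrative.

```python
import numpy as np

def mean_image(imgGra, Q=8):
    """Convert imgGra into the mean image imgAve: split [miGra, maGra]
    into Q gray-value intervals of width flGra and replace every pixel
    value with the average of the pixel values falling in its interval."""
    g = imgGra.astype(float)
    miGra, maGra = g.min(), g.max()
    if maGra == miGra:                 # constant image: nothing to average
        return g.copy()
    flGra = (maGra - miGra) / Q        # the gray value interval
    # interval index i in [0, Q-1]; the top value joins the last interval
    idx = np.minimum(((g - miGra) / flGra).astype(int), Q - 1)
    imgAve = np.empty_like(g)
    for i in range(Q):
        m = idx == i
        if m.any():
            imgAve[m] = g[m].mean()    # mean of the i-th interval
    return imgAve
```

The output contains at most Q distinct values, which is exactly the pixel-level limitation the paragraph above describes.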
Preferably, S31 comprises:
performing edge detection on aP using the Laplacian edge detection operator to obtain the edge pixel points;
storing all the obtained edge pixel points into the set oneSet.
Specifically, besides the Laplacian edge detection operator, other edge detection algorithms may also be used to obtain the edge pixel points.
Preferably, S32 comprises:
for an edge pixel point blPix, taking a window W of size H×H centered on blPix as the noise reduction window of blPix;
taking the pixel points within the window W as the pixel points in the noise reduction window corresponding to blPix.
The noise reduction window provides a reference neighborhood for noise reduction of the edge pixel points.
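Steps S32–S34 can also be written directly as set operations. In this small sketch, coordinate tuples stand for pixel points and H = 3 is an assumed window size.

```python
def build_sets(oneSet, shape, H=3):
    """S32-S34 as explicit coordinate sets: for each edge pixel blPix,
    collect the pixels of its H x H noise reduction window into twoSet;
    every other pixel of the image goes into thrSet."""
    r = H // 2
    rows, cols = shape
    twoSet = set()
    for (y, x) in oneSet:                      # blPix = (y, x)
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < rows and 0 <= nx < cols:  # clip at borders
                    twoSet.add((ny, nx))
    allPix = {(y, x) for y in range(rows) for x in range(cols)}
    thrSet = allPix - twoSet                   # the complement forms twoP
    return twoSet, thrSet
```

By construction twoSet and thrSet partition the image, so merging the two denoised images in S37 reassigns every pixel exactly once.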
Preferably, S36 comprises:
performing noise reduction processing on the first image oneP with an improved wavelet noise reduction algorithm to obtain the noise-reduced first image afOneP;
performing noise reduction processing on the second image twoP with a hybrid noise reduction algorithm to obtain the noise-reduced second image afTwoP.
In the invention, based on the characteristics of the edge pixel points and the common pixel points, different noise reduction methods are adopted, which improves the accuracy of the noise reduction result. For the detected edge pixel points, among which noise pixel points may exist, the invention adopts a wavelet noise reduction algorithm with higher noise detection accuracy, though longer running time, thereby avoiding blurring of edge details caused by denoising edge pixel points as if they were noise. For the common pixel points, the invention adopts a hybrid noise reduction algorithm; since no transform-domain computation is involved, it is more efficient. Rather than a single noise reduction algorithm, it first performs median-based noise reduction to obtain an intermediate image, then performs saliency detection on the intermediate image and denoises only the salient pixel points, which improves the accuracy of the noise reduction result while avoiding a second noise reduction pass over all pixel points, improving efficiency.
Preferably, performing noise reduction processing on oneP with the improved wavelet noise reduction algorithm to obtain the noise-reduced first image afOneP comprises:
performing wavelet decomposition on oneP to obtain high-frequency wavelet coefficients hwcf and low-frequency wavelet coefficients lwcf;
processing the high-frequency wavelet coefficients hwcf to obtain the processed high-frequency wavelet coefficients dhwcf, applying one of two improved threshold formulas according to a condition on hwcf [the condition and both threshold formulas are rendered only as images in the source and are not reproduced];
where gthre represents the wavelet noise reduction threshold, sgn represents the sign function, c represents a constant parameter, a first control factor and a second control factor are built from the difference coefficients, hpixet represents the difference coefficient of the pixel points in oneP whose pixel value is greater than T, alpixet represents the difference coefficient of all the pixel points in oneP, and T is the segmentation threshold calculated for oneP using the Otsu method;
reconstructing dhwcf and lwcf to obtain the noise-reduced first image afOneP.
Compared with existing wavelet noise reduction algorithms, the invention introduces the difference coefficient into the threshold formula. The larger the difference coefficient, the more complex the distribution of pixel values in oneP and the greater the possibility of containing noise pixel points, and the threshold formula is drawn toward the soft threshold function to improve the noise reduction effect; the simpler the distribution of pixel values in oneP, the less likely noise pixel points are contained, and the threshold formula is drawn toward the hard threshold function, retaining more detail information, thereby achieving a balance between the noise reduction effect and the noise reduction efficiency.
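Since the improved threshold formulas appear only as images in the source, the sketch below shows the classic soft and hard threshold functions and a simple interpolation between them driven by a control factor alpha. In the patent's terms, alpha would grow with the difference-coefficient ratio hpixet/alpixet; that exact mapping, and the blend itself, are assumptions illustrating the soft-versus-hard trade-off described above.

```python
import numpy as np

def soft_threshold(hwcf, gthre):
    """Classic soft threshold: shrink all coefficients toward zero."""
    return np.sign(hwcf) * np.maximum(np.abs(hwcf) - gthre, 0.0)

def hard_threshold(hwcf, gthre):
    """Classic hard threshold: zero small coefficients, keep large ones."""
    return np.where(np.abs(hwcf) > gthre, hwcf, 0.0)

def blended_threshold(hwcf, gthre, alpha):
    """Interpolate between hard (alpha=0) and soft (alpha=1) thresholding.
    A more complex pixel distribution (larger difference coefficient)
    would pull alpha toward 1, strengthening the noise suppression; a
    simpler distribution keeps alpha near 0, preserving detail."""
    return (1.0 - alpha) * hard_threshold(hwcf, gthre) \
           + alpha * soft_threshold(hwcf, gthre)
```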
Preferably, hpixet is calculated by the following formula:
[formula not reproduced in the source]
where btu represents the set of all pixel levels among the pixel points in oneP whose pixel value is greater than T, num_u represents the number of pixel points at pixel level u, and numbtu represents the total number of pixel points in oneP whose pixel value is greater than T.
Preferably, alpixet is calculated by the following formula:
[formula not reproduced in the source]
where alu represents the set of all pixel levels in oneP, num_v represents the number of pixel points at pixel level v, and numalu represents the total number of pixel points in oneP.
Preferably, performing noise reduction processing on twoP with the hybrid noise reduction algorithm to obtain the noise-reduced second image afTwoP comprises:
performing noise reduction processing on each pixel point in twoP using the following formula to obtain an intermediate image F:
F(a) = midFli(nei_a)
where F(a) represents the pixel value of the pixel point a after noise reduction processing of the pixel point a in twoP, nei_a represents the set of pixel points in the 8-neighborhood of a, and midFli(nei_a) represents the median of the pixel values of the pixel points in nei_a after sorting them;
performing saliency detection on the pixel points in the intermediate image F to obtain a set of salient pixel points tcSet;
performing noise reduction processing on each pixel point in the set tcSet using the DnCNN noise reduction algorithm to obtain the noise-reduced second image afTwoP.
Preferably, performing saliency detection on the pixel points in the intermediate image F to obtain the set of salient pixel points tcSet comprises:
for a pixel point b in the intermediate image F, calculating the saliency detection parameter tcpix(b) of b using the following formula:
[formula not reproduced in the source]
where F(b) represents the pixel value of pixel point b in F, and twoP(b) represents the pixel value of the corresponding pixel point in twoP;
if tcpix(b) is greater than the set threshold, the pixel point b is a salient pixel point.
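A NumPy sketch of the intermediate image and the saliency test. The absolute difference |F(b) − twoP(b)| and the threshold value are assumptions (the source formula is an image), and the DnCNN stage is not reproduced here.

```python
import numpy as np

def median_image(twoP):
    """Build the intermediate image F: each pixel becomes the median of
    its 8-neighbourhood (the midFli(nei_a) step)."""
    p = np.pad(twoP.astype(float), 1, mode="edge")
    rows, cols = twoP.shape
    stack = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue                       # nei_a excludes a itself
            stack.append(p[1 + dy : 1 + dy + rows, 1 + dx : 1 + dx + cols])
    return np.median(np.stack(stack), axis=0)

def salient_pixels(F, twoP, thresh=20.0):
    """Assumed saliency test tcpix(b) = |F(b) - twoP(b)|: a large gap
    between the median-filtered and original value marks b as a salient
    (likely noisy) pixel to be passed to the second denoising stage."""
    return np.abs(F - twoP) > thresh
```

An isolated impulse differs strongly from the median of its neighbourhood, so only such pixels are flagged for the second, heavier noise reduction pass.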
Preferably, S4 comprises:
performing three-dimensional reconstruction on bP with a three-dimensional reconstruction algorithm based on the 3DMM model to obtain a three-dimensional face model;
acquiring the face texture map of bP.
Specifically, the three-dimensional reconstruction algorithm may be the 2DASL algorithm, the FML algorithm, or the like.
Preferably, S5 comprises:
acquiring the topological structure of the face of the digital person;
registering the three-dimensional face model with the topological structure of the face of the digital person to obtain a registered face model;
acquiring a target face model with skeleton points based on the registered face model;
and applying the face texture map to the target face model to generate the face of the digital person.
Specifically, in the present invention, after the noise-reduced image is obtained, the steps for generating the digital person follow the prior art.
Since the topology of the face of the digital person is not the same as the topology in the three-dimensional face model, it is necessary to register vertices in the three-dimensional face model with vertices in the topology of the face of the digital person.
The skeletal points enable the expression of the digital person's face to change as the real facial expression changes.
To apply the face texture map to the target face model, the face texture map is first registered with the target topological structure to obtain a registered face texture map, which can then be applied to the target face model to generate the face of the digital person.
After the registration of the face model and of the face texture map, when faces of the corresponding digital person are continuously produced from a plurality of face images, the expression of the digital person changes along with the expression changes in those face images.
On the other hand, the invention also provides a digital person generation system based on AI and image synthesis technology, comprising an acquisition module, a preprocessing module, a noise reduction module, a reconstruction module and a generation module;
the acquisition module is used for acquiring a real face image P;
the preprocessing module is used for preprocessing P to obtain a preprocessed image aP;
the noise reduction module is used for performing noise reduction processing on aP in the following manner to obtain a noise-reduced image bP:
performing edge detection on aP to obtain a set of edge pixel points oneSet;
respectively acquiring the pixel points in the noise reduction window corresponding to each edge pixel point;
storing all pixel points in the noise reduction windows into a set twoSet;
storing the pixel points of aP that do not belong to the set twoSet into a set thrSet;
forming a first image oneP from the pixel points in the set twoSet, and a second image twoP from the pixel points in the set thrSet;
performing noise reduction on the pixel points in oneP and twoP respectively to obtain a noise-reduced first image afOneP and a noise-reduced second image afTwoP;
merging afOneP and afTwoP to obtain the noise-reduced image bP;
the reconstruction module is used for performing three-dimensional reconstruction on bP to obtain a three-dimensional face model;
the generation module is used for generating the digital person based on the three-dimensional face model.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present invention. Furthermore, embodiments of the invention and features of the embodiments may be combined with each other without conflict.

Claims (7)

1. A digital person generation method based on AI and image synthesis technology, comprising:
S1, acquiring a real face image P;
S2, preprocessing P to obtain a preprocessed image aP;
S3, performing noise reduction on aP in the following manner to obtain a noise-reduced image bP:
S31, performing edge detection on aP to obtain a set of edge pixel points oneSet;
S32, respectively acquiring the pixel points in the noise reduction window corresponding to each edge pixel point;
S33, storing all pixel points in the noise reduction windows into a set twoSet;
S34, storing the pixel points of aP that do not belong to the set twoSet into a set thrSet;
S35, forming a first image oneP from the pixel points in the set twoSet, and a second image twoP from the pixel points in the set thrSet;
S36, respectively performing noise reduction on the pixel points in oneP and twoP to obtain a noise-reduced first image afOneP and a noise-reduced second image afTwoP;
S37, merging afOneP and afTwoP to obtain the noise-reduced image bP;
S4, performing three-dimensional reconstruction on bP to obtain a three-dimensional face model;
S5, generating a digital person based on the three-dimensional face model;
the step S2 comprises:
S21, dividing P into a plurality of sub-blocks of the same size;
S22, respectively performing brightness adjustment processing on each sub-block to obtain brightness-adjusted sub-blocks;
S23, forming the preprocessed image aP from all brightness-adjusted sub-blocks;
wherein the brightness adjustment processing of a sub-block is performed as follows:
acquiring the image corresponding to the luminance component of the sub-block in the Lab color space;
acquiring the grayscale image corresponding to that luminance image;
acquiring the illumination image corresponding to the grayscale image based on the retinex model;
subtracting the illumination image from the grayscale image to obtain the corresponding reflection image;
converting the illumination image into a mean image;
adding the mean image and the reflection image to obtain an adjusted image;
converting the adjusted image back to the RGB color space to obtain the brightness-adjusted sub-block.
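The per-sub-block brightness leveling of claim 1 can be sketched on a single grayscale sub-block as follows. The claim does not specify how the retinex illumination image is estimated, so a local-mean box blur is an assumed stand-in here; `box_blur` and `adjust_brightness` are illustrative names, and the Lab/RGB conversions are omitted.

```python
import numpy as np

def box_blur(img, r=3):
    # Assumed illumination estimate: local mean over a (2r+1)^2 window.
    h, w = img.shape
    p = np.pad(img, r, mode='edge')
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = p[y:y + 2 * r + 1, x:x + 2 * r + 1].mean()
    return out

def adjust_brightness(gray):
    g = gray.astype(float)
    illumination = box_blur(g)            # retinex illumination image
    reflection = g - illumination         # image subtraction step
    mean_image = np.full_like(g, illumination.mean())  # mean image
    return np.clip(mean_image + reflection, 0, 255)    # image addition
```

Replacing the smooth illumination with its mean flattens uneven lighting across the sub-block while the reflection term preserves local detail.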
2. The digital person generating method based on AI and image synthesis technology of claim 1, wherein S31 includes:
performing edge detection on aP using the Laplacian edge detection operator to obtain edge pixel points;
storing all obtained edge pixel points into the set oneSet.
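A minimal sketch of this Laplacian edge detection step, assuming a grayscale NumPy array and the standard 4-neighbour Laplacian kernel; the threshold value is an illustrative assumption, not taken from the patent.

```python
import numpy as np

def laplacian_edges(img, thresh=20.0):
    g = img.astype(float)
    p = np.pad(g, 1, mode='edge')
    # 4-neighbour Laplacian: sum of the four neighbours minus 4x centre.
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] +
           p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * g)
    # Return the set of edge pixel coordinates ("oneSet").
    return set(zip(*np.nonzero(np.abs(lap) > thresh)))
```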
3. The digital person generating method based on AI and image synthesis technology of claim 1, wherein said S32 comprises:
for an edge pixel point, taking a window of a preset size centred on that edge pixel point as the noise reduction window of that edge pixel point;
taking the pixel points within the window as the pixel points in the noise reduction window corresponding to that edge pixel point.
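The windowing step of claim 3 can be sketched as follows. The window size is left unspecified in the claim, so k=3 is an assumption; the window is clipped at the image border, which the claim also does not address.

```python
def window_pixels(shape, center, k=3):
    """Coordinates of the k x k noise reduction window centred on an
    edge pixel, clipped at the image border."""
    h, w = shape
    y, x = center
    r = k // 2
    return {(yy, xx)
            for yy in range(max(0, y - r), min(h, y + r + 1))
            for xx in range(max(0, x - r), min(w, x + r + 1))}
```

The union of these windows over all edge pixels is the set twoSet of steps S32 and S33.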
4. The digital person generating method based on AI and image synthesis technology of claim 1, wherein S36 includes:
for the first image oneP, performing noise reduction on oneP with an improved wavelet noise reduction algorithm to obtain the noise-reduced first image afOneP;
for the second image twoP, performing noise reduction on twoP with a hybrid noise reduction algorithm to obtain the noise-reduced second image afTwoP.
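Claim 4 names an "improved wavelet noise reduction algorithm" without disclosing it. The sketch below substitutes a plain one-level 2-D Haar transform with soft thresholding of the detail subbands, purely to illustrate the general wavelet-thresholding idea; it assumes an even-sized grayscale array and an illustrative threshold.

```python
import numpy as np

def haar_denoise(img, thresh):
    g = img.astype(float)
    h, w = g.shape  # assumes even h and w
    # Analysis: averages/differences along rows, then along columns.
    a = (g[:, 0::2] + g[:, 1::2]) / 2.0
    d = (g[:, 0::2] - g[:, 1::2]) / 2.0
    ll = (a[0::2] + a[1::2]) / 2.0
    lh = (a[0::2] - a[1::2]) / 2.0
    hl = (d[0::2] + d[1::2]) / 2.0
    hh = (d[0::2] - d[1::2]) / 2.0

    def soft(c):
        # Soft thresholding: shrink detail coefficients toward zero.
        return np.sign(c) * np.maximum(np.abs(c) - thresh, 0.0)

    lh, hl, hh = soft(lh), soft(hl), soft(hh)
    # Synthesis: invert the column step, then the row step.
    a2 = np.empty((h, w // 2)); d2 = np.empty((h, w // 2))
    a2[0::2], a2[1::2] = ll + lh, ll - lh
    d2[0::2], d2[1::2] = hl + hh, hl - hh
    out = np.empty((h, w))
    out[:, 0::2], out[:, 1::2] = a2 + d2, a2 - d2
    return out
```

With the threshold at zero the transform is perfectly invertible; a positive threshold suppresses small detail coefficients, which is where most of the noise energy lives.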
5. The digital person generating method based on AI and image synthesis technology of claim 1, wherein S4 includes:
performing three-dimensional reconstruction on bP with a three-dimensional reconstruction algorithm based on the 3DMM model to obtain the three-dimensional face model;
acquiring the face texture map of bP.
6. The digital person generation method based on AI and image synthesis technology of claim 5, wherein S5 includes:
acquiring the topological structure of the face of the digital person;
registering the three-dimensional face model with the topological structure of the face of the digital person to obtain a registered face model;
acquiring a target face model with skeleton points based on the registered face model;
and applying the face texture map to the target face model to generate the face of the digital person.
7. A digital person generation system based on AI and image synthesis technology, characterized by comprising an acquisition module, a preprocessing module, a noise reduction module, a reconstruction module and a generation module;
the acquisition module is used for acquiring the real face image
Figure QLYQS_60
The preprocessing module is used for matching
Figure QLYQS_61
Preprocessing to obtain preprocessed image +.>
Figure QLYQS_62
The noise reduction module is used for adopting the following modes
Figure QLYQS_63
Performing noise reduction processing to obtain noise-reduced image +.>
Figure QLYQS_64
For a pair of
Figure QLYQS_65
Edge detection is carried out to obtain a set of edge pixel points +.>
Figure QLYQS_66
Respectively acquiring pixel points in a noise reduction window corresponding to each edge pixel point;
storing all pixel points in the noise reduction window into a set
Figure QLYQS_67
Will be
Figure QLYQS_68
In not belonging to the->
Figure QLYQS_69
Pixel point storage set +.>
Figure QLYQS_70
From a collection
Figure QLYQS_71
The pixels in (a) constitute a first image +.>
Figure QLYQS_72
By the collection->
Figure QLYQS_73
The pixels in the array form a second image
Figure QLYQS_74
Respectively to
Figure QLYQS_75
And->
Figure QLYQS_76
Noise reduction is carried out on the pixel points in the image to obtain a first image after noise reduction +.>
Figure QLYQS_77
And a second image after noise reduction +.>
Figure QLYQS_78
Will be
Figure QLYQS_79
And->
Figure QLYQS_80
Combining to obtain a noise-reduced image->
Figure QLYQS_81
The reconstruction module is used for matching
Figure QLYQS_82
Performing three-dimensional reconstruction to obtain a three-dimensional face model;
the generation module is used for generating a digital person based on the three-dimensional face model;
for a pair of
Figure QLYQS_83
Preprocessing to obtain preprocessed image +.>
Figure QLYQS_84
Comprising:
s21, dividing P into a plurality of sub-blocks with the same size;
s22, respectively carrying out brightness adjustment processing on each sub-block to obtain sub-blocks with brightness adjusted;
s23, forming all sub-blocks with brightness adjusted into a preprocessing image
Figure QLYQS_85
For sub-blocks
Figure QLYQS_86
The following method is adopted for->
Figure QLYQS_87
And (3) performing brightness adjustment processing:
acquisition of
Figure QLYQS_88
Image corresponding to luminance component in Lab color space +.>
Figure QLYQS_89
Acquisition of
Figure QLYQS_90
Corresponding gray-scale image->
Figure QLYQS_91
Acquisition based on retinex model
Figure QLYQS_92
Corresponding illumination image->
Figure QLYQS_93
Will be
Figure QLYQS_94
And->
Figure QLYQS_95
Performing subtraction of the image to obtain +.>
Figure QLYQS_96
Corresponding reflection image +.>
Figure QLYQS_97
Will be
Figure QLYQS_98
Conversion to mean image>
Figure QLYQS_99
Will be
Figure QLYQS_100
And->
Figure QLYQS_101
Performing addition operation of the image to obtain an image +.>
Figure QLYQS_102
Will be
Figure QLYQS_103
And converting back to the RGB color space to obtain the sub-block with the adjusted brightness.
CN202310375969.6A 2023-04-11 2023-04-11 Digital person generation method and system based on AI and image synthesis technology Active CN116109743B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310375969.6A CN116109743B (en) 2023-04-11 2023-04-11 Digital person generation method and system based on AI and image synthesis technology


Publications (2)

Publication Number Publication Date
CN116109743A (en) 2023-05-12
CN116109743B (en) 2023-06-20

Family

ID=86261910


Country Status (1)

Country Link
CN (1) CN116109743B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018040099A1 (en) * 2016-08-31 2018-03-08 深圳市唯特视科技有限公司 Three-dimensional face reconstruction method based on grayscale and depth information
CN115660964A (en) * 2022-08-18 2023-01-31 贵州大学 Nighttime road image enhancement method based on threshold partition weighted brightness component

Family Cites Families (6)

Publication number Priority date Publication date Assignee Title
US6229578B1 (en) * 1997-12-08 2001-05-08 Intel Corporation Edge-detection based noise removal algorithm
JP2008165312A (en) * 2006-12-27 2008-07-17 Konica Minolta Holdings Inc Image processor and image processing method
US10303973B2 (en) * 2015-04-15 2019-05-28 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium for lighting processing on image using model data
CN113506367B (en) * 2021-08-24 2024-02-27 广州虎牙科技有限公司 Three-dimensional face model training method, three-dimensional face reconstruction method and related devices
CN114693876A (en) * 2022-04-06 2022-07-01 北京字跳网络技术有限公司 Digital human generation method, device, storage medium and electronic equipment
CN114863030B (en) * 2022-05-23 2023-05-23 广州数舜数字化科技有限公司 Method for generating custom 3D model based on face recognition and image processing technology


Non-Patent Citations (1)

Title
Research on Compressed-Domain Image Enhancement Methods Based on Retinex Theory; Wang Ronggui et al.; Journal of Computer Research and Development; full text *

Also Published As

Publication number Publication date
CN116109743A (en) 2023-05-12

Similar Documents

Publication Publication Date Title
Wang et al. Single image dehazing based on the physical model and MSRCR algorithm
Li et al. Image fusion with guided filtering
CN105654436B (en) A kind of backlight image enhancing denoising method based on prospect background separation
CN107358585B (en) Foggy day image enhancement method based on fractional order differential and dark channel prior
CN104463804B (en) Image enhancement method based on intuitional fuzzy set
CN106875358A (en) Image enchancing method and image intensifier device based on Bayer format
CN106709504B (en) High fidelity tone mapping method for detail preservation
CN106683056A (en) Airborne photoelectric infrared digital image processing method and apparatus thereof
CN117252773A (en) Image enhancement method and system based on self-adaptive color correction and guided filtering
CN112819688A (en) Conversion method and system for converting SAR (synthetic aperture radar) image into optical image
Cai et al. Perception preserving decolorization
Lei et al. Low-light image enhancement using the cell vibration model
Gu et al. A novel Retinex image enhancement approach via brightness channel prior and change of detail prior
CN116109743B (en) Digital person generation method and system based on AI and image synthesis technology
CN111462084B (en) Image vectorization printing bleeding point prediction system and method based on random forest
CN114897757B (en) NSST and parameter self-adaptive PCNN-based remote sensing image fusion method
CN116630198A (en) Multi-scale fusion underwater image enhancement method combining self-adaptive gamma correction
CN116468627A (en) Endoscope image enhancement method based on secondary weighted rapid guided filtering
Huang et al. An efficient single image enhancement approach using luminance perception transformation
CN107392869A (en) A kind of facial image filtering method based on holding edge filter device
Wang et al. New region-based image fusion scheme using the discrete wavelet frame transform
CN110796609B (en) Low-light image enhancement method based on scale perception and detail enhancement model
CN113610863A (en) Multi-exposure image fusion quality evaluation method
David Low illumination image enhancement algorithm using iterative recursive filter and visual gamma transformation function
Meng et al. Multi-modal MRI image fusion of the brain based on joint bilateral filter and non-subsampled shearlet transform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant