CN109741247A - Portrait caricature generation method based on a neural network - Google Patents

Portrait caricature generation method based on a neural network

Info

Publication number
CN109741247A
CN109741247A (application CN201811631295.7A)
Authority
CN
China
Prior art keywords
sequence
vector
face
points
portrait
Prior art date
Legal status
Granted
Application number
CN201811631295.7A
Other languages
Chinese (zh)
Other versions
CN109741247B (en)
Inventor
吕建成
汤臣薇
徐坤
贺喆南
李婵娟
Current Assignee
Sichuan University
Original Assignee
Sichuan University
Priority date
Filing date
Publication date
Application filed by Sichuan University
Priority to CN201811631295.7A
Publication of CN109741247A
Application granted
Publication of CN109741247B
Legal status: Active

Landscapes

  • Image Processing (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a portrait caricature generation method based on a neural network, comprising the following steps: S1, extracting the structural features of the face in a real face image and converting them into sequence feature data; S2, inputting the sequence feature data into a trained Seq2Seq VAE model to generate the corresponding exaggerated structure sequence points; S3, applying the generated exaggerated structure sequence points to the real face image to realize its exaggerated deformation; S4, applying a cartoon style to the face image after the exaggerated deformation to generate the portrait caricature. The invention creatively represents the facial structure features as sequence features and uses a Seq2Seq VAE model to generate the exaggerated sequences, which are then applied to caricature generation. The method overcomes the limitations of existing image translation methods: the generated exaggerated caricatures are humorous and exaggerated without damaging the recognizability of the person, and can also reflect the drawing styles of different caricaturists.

Description

Portrait caricature generation method based on a neural network
Technical field
The invention belongs to the technical field of image processing, and in particular relates to a portrait caricature generation method based on a neural network.
Background art
Portrait drawing remains a popular form of artistic expression in modern times. With the continuous development of machine vision technology, portrait drawing has been widely applied in multimedia fields such as virtual reality, augmented reality and portrait-drawing robot systems, in personalized entertainment, and on the Internet. To enhance the artistic expressiveness of portraits, various types of artistic portraits, such as sketches, cartoons and caricatures, have been generated from different artistic feature perspectives; among them the caricature, as a common art form, has attracted the attention and research of many scholars.
With the development of artificial intelligence, more and more scholars have begun to study the combination of artificial intelligence and art, namely computational art. Through mathematics and statistics, the rules contained in art can be quantified as mathematical relationships; for example, the golden section has strict proportionality, artistry and harmony, and rich aesthetic value. These mathematical relationships have become part of the theoretical basis of computational art. When drawing involves the depiction of people, there are many different forms of graphic art.
As shown in Figure 1, portrait painting includes exaggerated portrait caricatures, sketches, cartoons, simple line drawings and the like. An exaggerated portrait caricature, as its name suggests, expresses the significant differences between a person and the average face through the exaggeration and deformation of the facial organs. Compared with a realistic sketch, an exaggerated caricature adds humorous elements on a realistic basis. Unlike cartoons and simple line drawings, an exaggerated caricature satisfies the enjoyment of caricature while retaining the recognizability of the person. Simple art forms such as sketches, line drawings and cartoons have already been the subject of a great deal of research; in contrast, only a few studies have concentrated on exaggerated portrait caricature generation.
The generation of an exaggerated portrait caricature can be regarded as a style conversion from a real face image to a caricature image. Image-to-image translation is a popular vision problem whose goal is to learn the style characteristics of a target image and the mapping between input and output images. Among such methods, generative adversarial networks (GANs) based on convolutional neural networks (CNNs) are considered one of the most popular image translation methods. However, existing methods can only transform the texture and color of an image; when the task involves changes of image content and geometry, the effect of CNN-based adversarial methods is very unsatisfactory, and the generation of exaggerated portrait caricatures involves precisely the exaggerated deformation of image content, namely the facial structure.
In order to convert portrait photos into corresponding portrait caricatures, one prior-art method is example-based: given a face portrait photo, each face is decomposed into different parts (such as the nose and mouth); for each part, the corresponding caricature component is retrieved from a data set by feature matching, and the caricature components are then combined to build the caricature face. Another method is based on facial features: active shape model feature points are defined first; then, based on the face and its correlations, an exaggerated portrait is generated from the real face image, the facial exaggeration shape is obtained from the face-shape exaggeration and the organ exaggeration, and the 'principle of contrast' is introduced; finally, the exaggerated portrait of the face image is generated in combination with an image warping method.
In the above example-based prior-art method, a large number of caricature components must be drawn and collected according to different local facial features to build the database; the workload is enormous and the technical requirements on the artists are high. The assembled faces are relatively fixed and lack diversity; moreover, the final effect is merely a cartoonization of the original face, without deforming the distinctive features of the input face, and therefore does not meet the definition of an exaggerated portrait caricature. Although the other, feature-based method can exaggerate the original face to a certain extent, the effect is not obvious, the result lacks the recognizability of the person, and the obtained caricature style is monotonous.
Summary of the invention
In view of the above deficiencies in the prior art, the portrait caricature generation method based on a neural network provided by the invention solves the limitations of existing image translation methods and the problem that the portrait caricature styles obtained by existing portrait caricature generation methods are monotonous.
In order to achieve the above object of the invention, the technical solution adopted by the present invention is a portrait caricature generation method based on a neural network, comprising the following steps:
S1, extracting the structural features of the face in a real face image, and converting the extracted structural feature data into sequence feature data;
S2, inputting the sequence feature data into a trained Seq2Seq VAE model to generate the exaggerated structure sequence points corresponding to the face image;
S3, applying the generated exaggerated structure sequence points to the real face image by means of thin-plate spline interpolation, thereby realizing the exaggerated deformation of the real face image;
S4, applying a cartoon style to the face image after the exaggerated deformation by means of CycleGAN, thereby generating the portrait caricature.
Further, in step S1 the structural features of the face include the contour structure features and the organ structure features of the face;
step S1 specifically comprises:
extracting 68 sequence points of the real face image as the structural feature data of the face, and obtaining, from the absolute coordinates of each sequence point, the offset coordinate sequence of each sequence point relative to the previous sequence point; the offset coordinate sequence is the sequence feature data;
wherein each sequence point carries a state value, and a sequence point with its state value is expressed as Q(x, y, p1, p2, p3);
wherein x and y indicate the offset of the sequence point relative to the previous sequence point in the x and y directions;
p1, p2, p3 form a binary one-hot vector of three facial states: p1 indicates that the sequence point is the starting point of the facial contour or of a facial organ, p2 indicates that the sequence point belongs to the same organ as the previous sequence point, and p3 indicates that the sequence point is the last of the 68 sequence points.
Further, the method for training the Seq2Seq VAE model in step S2 specifically comprises:
A1, inputting the forward sequence of the facial structure features of one real face image and its reverse sequence into the encoder to obtain the forward feature vector h→ and the reverse feature vector h←, and concatenating them into the final feature vector h;
A2, mapping the final feature vector h through two fully connected layer networks to the mean vector μ and the standard deviation vector σ respectively, and sampling a random vector z that follows the normal distribution determined by μ and σ;
A3, inputting the random vector z into the decoder to obtain an initially trained Seq2Seq VAE network;
A4, successively inputting several real face images into the previously trained Seq2Seq VAE network and repeating steps A1-A3 until the Seq2Seq VAE network converges, thereby obtaining the trained Seq2Seq VAE model.
Further, the encoder in step A1 comprises a bidirectional LSTM network module, and the bidirectional LSTM network module comprises two LSTM networks each with 68 time steps.
Further, step A1 specifically comprises:
inputting each element of the forward sequence of the facial structure features of one real face image into one LSTM network to obtain the forward feature vector h→; at the same time, inputting each element of the reverse sequence of the facial structure features of the real face image into the other LSTM network to obtain the reverse feature vector h←; and concatenating the forward feature vector h→ and the reverse feature vector h← into the final feature vector h;
wherein the forward sequence is (S0, S1, ..., Si, ..., S67);
the reverse sequence is (S67, S66, ..., Si, ..., S0);
wherein i = 0, 1, 2, ..., 67.
Further, in step A2:
the random vector z is:
z = μ + σ ⊙ N(0, 1)
wherein ⊙ denotes element-wise vector multiplication;
N(0, 1) is an IID Gaussian vector.
Further, in step A3, the decoder is an LSTM network with a time span of 68;
the input of the LSTM network at each moment further includes the vector Tt obtained at the previous moment and the source point St;
the output end of the LSTM network outputs a vector Ot at each moment, and the output vector Ot at the current moment t is sampled through a Gaussian mixture model to obtain the vector Tt, which is input into the LSTM network at the next moment;
wherein t denotes the moment, t = 0, 1, 2, ..., 67;
the vector T0 and the source point S0 input into the LSTM network at the initial moment are initialized to (0, 0, 1, 0, 0).
Further, the method by which the output vector Ot at the current moment t is sampled through the Gaussian mixture model to obtain the vector Tt specifically comprises:
B1, determining the number N of normal distributions in the Gaussian mixture model, setting the dimension of the output vector Ot to 6N, and decomposing Ot as:
Ot = (wn, μ(x,n), μ(y,n), σ(x,n), σ(y,n), ρ(xy,n)), n = 1, 2, ..., N
wherein n denotes the n-th component of the Gaussian mixture model;
x denotes the abscissa;
y denotes the ordinate;
wn denotes the weight of the n-th component, with Σn wn = 1;
μ(x,n) denotes the expectation of the abscissa x;
μ(y,n) denotes the expectation of the ordinate y;
σ(x,n) denotes the standard deviation of the abscissa x;
σ(y,n) denotes the standard deviation of the ordinate y;
ρ(xy,n) denotes the correlation coefficient;
B2, determining the probability p(x, y; t) of sampling Tt when the decomposed Ot is input into the Gaussian mixture model;
wherein the probability p(x, y; t) is:
p(x, y; t) = Σn w(n, t) N(x, y | μ(x,n,t), μ(y,n,t), σ(x,n,t), σ(y,n,t), ρ(xy,n,t))
wherein w(n, t) denotes the weight of the n-th component of the Gaussian mixture model at moment t;
N(x, y) denotes that the coordinates (x, y) follow a normal distribution with parameters μ, σ, ρ;
μ(x, n, t) denotes the expectation of the abscissa of the n-th Gaussian model at moment t;
μ(y, n, t) denotes the expectation of the ordinate of the n-th Gaussian model at moment t;
σ(x, n, t) denotes the standard deviation of the abscissa of the n-th Gaussian model at moment t;
σ(y, n, t) denotes the standard deviation of the ordinate of the n-th Gaussian model at moment t;
ρ(xy, n, t) denotes the correlation coefficient;
B3, substituting the probability p(x, y; t) into the reconstruction error function to obtain the reconstruction error; maximizing the reconstruction error makes the Gaussian mixture model output the target vector Tt;
wherein the reconstruction error function is:
LR = (1/68) Σt log p(xt, yt; t), t = 0, 1, ..., 67
wherein LR is the reconstruction error;
(x, y) denote the abscissa and ordinate of a feature point.
The invention has the following beneficial effects: the portrait caricature generation method based on a neural network provided by the invention creatively proposes storing the facial structure features as sequence features, so that a Seq2Seq VAE model can be used to generate the exaggerated sequences, which are then applied to caricature generation. The method overcomes the limitations of existing image translation methods: the generated exaggerated caricatures are humorous and exaggerated while not damaging the recognizability of the person, and also reflect the drawing styles of different caricaturists.
Brief description of the drawings
Fig. 1 is a schematic diagram of portrait painting types referred to in the background art.
Fig. 2 is a flow chart of the implementation of the portrait caricature generation method based on a neural network in the present invention.
Fig. 3 is a schematic diagram of converting facial structure features into a sequence feature representation using face alignment in an embodiment of the invention.
Fig. 4 is a flow chart of the implementation of the Seq2Seq VAE model training method in the present invention.
Fig. 5 is a schematic comparison of the complete objective L with several of its variants in an embodiment of the invention.
Fig. 6 shows the comparison results of different real face images in an embodiment of the invention.
Fig. 7 is a schematic comparison of the features of an input face and the corresponding 'average face' in an embodiment of the invention.
Fig. 8 is a schematic diagram of local exaggeration results on original images in an embodiment of the invention.
Fig. 9 is a schematic comparison of the results of applying different artistic styles to exaggerated faces in an embodiment of the invention.
Specific embodiments
Specific embodiments of the invention are described below to facilitate understanding by those skilled in the art; however, it should be clear that the invention is not limited to the scope of the specific embodiments. For those of ordinary skill in the art, as long as the various changes fall within the spirit and scope of the invention as defined and determined by the appended claims, these changes are obvious, and all innovations and creations that make use of the inventive concept are within the scope of protection.
As shown in Fig. 2, a portrait caricature generation method based on a neural network comprises the following steps:
S1, extracting the structural features of the face in a real face image, and converting the extracted structural feature data into sequence feature data;
S2, inputting the sequence feature data into a trained Seq2Seq VAE model to generate the exaggerated structure sequence points corresponding to the face image;
S3, applying the generated exaggerated structure sequence points to the real face image by means of thin-plate spline interpolation, thereby realizing the exaggerated deformation of the real face image;
When thin-plate spline interpolation is used, a Cartesian coordinate system is established on a thin plate; the independent variable x and the function value y are points distributed on this coordinate system. After bending deformation, the thin plate passes through all the corresponding function value points y while keeping the bending energy minimal. This interpolating function f takes the standard thin-plate form:
f(x, y) = a1 + a2·x + a3·y + Σi wi·U(‖Pi − (x, y)‖), with radial basis U(r) = r²·log r²,
where the Pi are the control points and the coefficients ai and wi are determined by the interpolation conditions.
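For illustration, a minimal NumPy sketch of fitting and applying such a thin-plate spline to the 68 landmark points (the array shapes and function names are assumptions, not part of the claimed method):

```python
import numpy as np

def tps_fit(src, dst):
    """Solve the thin-plate spline mapping src (n, 2) control points
    onto dst (n, 2) targets; returns (n + 3, 2) coefficients."""
    n = len(src)
    d = np.linalg.norm(src[:, None] - src[None, :], axis=-1)
    K = np.where(d > 0, d ** 2 * np.log(d ** 2), 0.0)  # U(r) = r^2 log r^2
    P = np.hstack([np.ones((n, 1)), src])              # affine part [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)

def tps_apply(coeffs, src, pts):
    """Evaluate the fitted spline at arbitrary points pts (m, 2)."""
    d = np.linalg.norm(pts[:, None] - src[None, :], axis=-1)
    U = np.where(d > 0, d ** 2 * np.log(d ** 2), 0.0)
    P = np.hstack([np.ones((len(pts), 1)), pts])
    return U @ coeffs[:-3] + P @ coeffs[-3:]
```

To deform the image itself, the spline is typically fitted from the exaggerated points back to the original points and evaluated on the pixel grid (backward warping), so that every output pixel samples a source location.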
S4, applying a cartoon style to the face image after the exaggerated deformation by means of CycleGAN, thereby generating the portrait caricature.
In the absence of paired data, CycleGAN can learn to convert an image from a source domain X to a target domain Y. The goal is to learn a mapping G: X → Y such that, through an added adversarial loss, the distribution of the images G(X) approaches the distribution Y. Because this mapping is highly under-constrained, the inverse mapping F: Y → X is generated in turn, and a cycle-consistency loss is introduced as a constraint so that F(G(X)) ≈ X.
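Schematically, the adversarial and cycle-consistency terms can be written as follows (a PyTorch sketch assuming generators G, F and discriminators D_X, D_Y with a least-squares adversarial loss; not the patent's implementation):

```python
import torch
import torch.nn.functional as nnf

def cyclegan_losses(G, F, D_X, D_Y, real_x, real_y, lam=10.0):
    fake_y, fake_x = G(real_x), F(real_y)
    # adversarial terms: each generator tries to make its discriminator
    # score the translated image as real (label 1)
    adv = nnf.mse_loss(D_Y(fake_y), torch.ones_like(D_Y(fake_y))) + \
          nnf.mse_loss(D_X(fake_x), torch.ones_like(D_X(fake_x)))
    # cycle-consistency terms: F(G(x)) should return to x, G(F(y)) to y
    cyc = nnf.l1_loss(F(fake_y), real_x) + nnf.l1_loss(G(fake_x), real_y)
    return adv + lam * cyc
```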
In the above step S1, face alignment is used to extract the facial contour and facial organ structure feature points of all real faces and the corresponding exaggerated portrait caricatures in the MMC data set; the structural features of the face include the contour structure features and the organ structure features of the face. 68 sequence points are extracted to represent the structural features of the face, and each sequence point is represented by absolute coordinates (x', y').
The above MMC data set is a distorting-mirror caricature data set, comprising a large number of collected real face images and the corresponding exaggerated portrait caricature face data.
Therefore, step S1 specifically comprises:
extracting 68 sequence points of the real face image as the structural feature data of the face, and obtaining, from the absolute coordinates of each sequence point, the offset coordinate sequence of each sequence point relative to the previous sequence point; the offset coordinate sequence is the sequence feature data;
wherein, in order to distinguish the facial organs, each sequence point carries a state value, and a sequence point with its state value is expressed as Q(x, y, p1, p2, p3); for a conventional rectangular image, the lower-left corner of the image is used as the origin of the coordinate system, so that the extracted sequence points can be regarded as points distributed in the first quadrant;
wherein x and y indicate the offset of the sequence point relative to the previous sequence point in the x and y directions;
p1, p2, p3 form a binary one-hot vector of three facial states: p1 indicates that the sequence point is the starting point of the facial contour or of a facial organ, p2 indicates that the sequence point belongs to the same organ as the previous sequence point, and p3 indicates that the sequence point is the last of the 68 sequence points. Through this state value, the facial structure can be divided into five major parts: contour, eyebrows, eyes, nose and mouth. Fig. 3 shows several samples of the MMC data set, together with the simple line drawings of the persons rendered from the 68 sequence points obtained by the face alignment method.
Since the real face images in the MMC data set do not exactly match the corresponding portrait caricatures, especially in the ratio of the face to the whole picture and in the angle of the person's profile, in order to reduce the errors caused by the data extraction, before the offset coordinate sequence is obtained, the first point S'0 = (x'0, y'0) and the last point S'16 = (x'16, y'16) of the facial contour of the real face image are taken as reference points, and the corresponding caricature is rotated and scaled until its 1st and 16th points are aligned with those of the real face; the 68 sequence points extracted in order are corrected in this way, and the offset coordinate sequence of each sequence point relative to the previous sequence point is then obtained from the absolute coordinates of each sequence point.
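A minimal sketch of this conversion, assuming the 68 absolute landmark coordinates (for example from a standard 68-point face alignment model) are already available as a (68, 2) array; the part boundary indices follow the common 68-point convention:

```python
import numpy as np

# start indices of the parts in the standard 68-point layout:
# jaw 0, eyebrows 17 and 22, nose 27, eyes 36 and 42, mouth 48
PART_STARTS = {0, 17, 22, 27, 36, 42, 48}

def to_offset_sequence(landmarks):
    """Convert (68, 2) absolute landmarks into Q(x, y, p1, p2, p3) rows:
    (dx, dy) offsets from the previous point plus a one-hot state value."""
    rows, prev = [], np.zeros(2)
    for i, pt in enumerate(landmarks):
        dx, dy = pt - prev
        if i == 67:
            state = (0, 0, 1)      # p3: last of the 68 sequence points
        elif i in PART_STARTS:
            state = (1, 0, 0)      # p1: start of the contour or an organ
        else:
            state = (0, 1, 0)      # p2: same organ as the previous point
        rows.append((dx, dy, *state))
        prev = pt
    return np.asarray(rows, dtype=np.float32)
```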
In the above step S2, the Seq2Seq VAE network is trained with paired exaggeration point sequence data; once the network has converged (i.e., it can reconstruct the corresponding caricature exaggeration sequence points from the sequence feature data of a real face), the network model is used to generate the exaggerated caricature sequence points for the real face feature sequence data of the test set.
As shown in Fig. 4, the method for training the Seq2Seq VAE model in the above step S2 specifically comprises:
A1, inputting the forward sequence of the facial structure features of one real face image and its reverse sequence into the encoder to obtain the forward feature vector h→ and the reverse feature vector h←, and concatenating them into the final feature vector h;
wherein the encoder comprises a bidirectional LSTM network module, and the bidirectional LSTM network module comprises two LSTM networks each with 68 time steps;
therefore, step A1 specifically comprises:
inputting each element of the forward sequence of the facial structure features of the real face image into one LSTM network to obtain the forward feature vector h→, and at the same time inputting each element of the reverse sequence into the other LSTM network to obtain the reverse feature vector h←; then concatenating the forward feature vector h→ and the reverse feature vector h← into the final feature vector h;
wherein the forward sequence is (S0, S1, ..., Si, ..., S67);
the reverse sequence is (S67, S66, ..., Si, ..., S0);
wherein i = 0, 1, 2, ..., 67.
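A schematic PyTorch encoder consistent with this description, using the dimensions of the embodiment below (256 per direction, 512 after concatenation); the module layout itself is an assumption:

```python
import torch
import torch.nn as nn

class SeqEncoder(nn.Module):
    """Bidirectional LSTM over the 68 offset points Q(x, y, p1, p2, p3)."""
    def __init__(self, hidden=256):
        super().__init__()
        self.lstm = nn.LSTM(input_size=5, hidden_size=hidden,
                            batch_first=True, bidirectional=True)

    def forward(self, q):              # q: (batch, 68, 5)
        _, (h_n, _) = self.lstm(q)     # h_n: (2, batch, hidden)
        # concatenate the forward and reverse final states into h (batch, 512)
        return torch.cat([h_n[0], h_n[1]], dim=-1)
```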
A2, mapping the final feature vector h to the mean vector μ and the standard deviation vector σ, and sampling a random vector z that follows the normal distribution determined by the mean vector μ and the standard deviation vector σ;
wherein the random vector z is:
z = μ + σ ⊙ N(0, 1)
wherein ⊙ denotes element-wise vector multiplication;
N(0, 1) is an IID Gaussian vector.
Between the distribution of the random vector z and the IID Gaussian vector N(0, 1) there is a divergence loss LKL:
LKL = KL(N(μ, σ) || N(0, 1))
wherein KL(·) denotes the KL distance;
N(μ, σ) denotes the normal distribution with mean μ and standard deviation σ;
KL(A || B) denotes the KL distance between distribution A and distribution B.
Once the random vector z is determined, the divergence loss LKL can be calculated; this loss is back-propagated automatically to train the LSTM network structure, so that the difference between the distribution of the z obtained for subsequent inputs and the Gaussian vector N(0, 1) becomes smaller and smaller.
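The two fully connected layers, the sampling z = μ + σ ⊙ N(0, 1) and the closed-form divergence loss LKL can be sketched as follows (parameterising σ through log σ² is an assumption made for numerical stability):

```python
import torch
import torch.nn as nn

class Reparameterize(nn.Module):
    """Two fully connected layers map h to mu and sigma;
    z = mu + sigma * N(0, 1) keeps the sampling differentiable."""
    def __init__(self, h_dim=512, z_dim=128):
        super().__init__()
        self.to_mu = nn.Linear(h_dim, z_dim)
        self.to_logvar = nn.Linear(h_dim, z_dim)   # predicts log sigma^2

    def forward(self, h):
        mu = self.to_mu(h)
        logvar = self.to_logvar(h)
        sigma = torch.exp(0.5 * logvar)
        z = mu + sigma * torch.randn_like(sigma)   # reparameterised sample
        # closed-form L_KL = KL(N(mu, sigma) || N(0, 1)), batch mean
        l_kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return z, l_kl
```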
A3, inputting the random vector z into the decoder to obtain an initially trained Seq2Seq VAE network;
wherein the decoder is an LSTM network with a time span of 68;
the input of the LSTM network at each moment further includes the vector Tt obtained at the previous moment and the source point St;
the output end of the LSTM network outputs a vector Ot at each moment, and the output vector Ot at the current moment t is sampled through a Gaussian mixture model to obtain the vector Tt, which is input into the LSTM network at the next moment;
wherein t denotes the moment, t = 0, 1, 2, ..., 67;
the vector T0 and the source point S0 input into the LSTM network at the initial moment are initialized to (0, 0, 1, 0, 0);
In the above process, since the output of the LSTM network at each moment is Ot, it cannot be directly input into the LSTM network at the next moment; Ot therefore has to be decomposed into the parameters required by the Gaussian mixture model, from which Tt is then obtained. A bivariate normal distribution is determined by five elements (μx, μy, σx, σy, ρxy), where μx and μy denote the means, σx and σy the standard deviations, and ρxy the correlation parameter; for a mixture of bivariate normal distributions there is also a weight w. Therefore, a GMM with N normal distributions requires (5+1)N parameters. For the sequence points of each face, the state values (p1, p2, p3) are fixed and therefore do not need to be generated.
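An unrolled decoder step can be sketched as follows (the hidden size and the exact way z is fed in are assumptions; `sample_fn` stands for the Gaussian-mixture sampling of steps B1-B3 below, returning the next 5-dimensional point with its fixed state values):

```python
import torch
import torch.nn as nn

class SeqDecoder(nn.Module):
    """Unrolled decoder: each of the 68 steps consumes the previously
    sampled point T, the source point S_t and the latent vector z."""
    def __init__(self, z_dim=128, hidden=512, N=20):
        super().__init__()
        self.cell = nn.LSTMCell(5 + 5 + z_dim, hidden)   # [T_prev, S_t, z]
        self.to_o = nn.Linear(hidden, 6 * N)             # O_t: 6N GMM params

    def forward(self, z, S, sample_fn):                  # S: (batch, 68, 5)
        batch = S.size(0)
        h = torch.zeros(batch, self.cell.hidden_size, device=S.device)
        c = torch.zeros_like(h)
        T = S.new_tensor([0.0, 0.0, 1.0, 0.0, 0.0]).expand(batch, 5)
        outputs = []
        for t in range(68):
            h, c = self.cell(torch.cat([T, S[:, t], z], dim=-1), (h, c))
            o_t = self.to_o(h)         # decomposed into GMM parameters
            T = sample_fn(o_t)         # sample the next point T_t
            outputs.append(o_t)
        return torch.stack(outputs, dim=1)               # (batch, 68, 6N)
```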
The method by which the output vector Ot at the current moment t is sampled through the Gaussian mixture model to obtain the vector Tt specifically comprises:
B1, determining the number N of normal distributions in the Gaussian mixture model, setting the dimension of the output vector Ot to 6N, and decomposing Ot as:
Ot = (wn, μ(x,n), μ(y,n), σ(x,n), σ(y,n), ρ(xy,n)), n = 1, 2, ..., N
wherein n denotes the n-th component of the Gaussian mixture model;
x denotes the abscissa;
y denotes the ordinate;
wn denotes the weight of the n-th component, with Σn wn = 1;
μ(x,n) denotes the expectation of the abscissa x;
μ(y,n) denotes the expectation of the ordinate y;
σ(x,n) denotes the standard deviation of the abscissa x;
σ(y,n) denotes the standard deviation of the ordinate y;
ρ(xy,n) denotes the correlation coefficient;
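A minimal decomposition of the 6N-dimensional Ot into valid mixture parameters (the squashing activations, i.e. softmax for the weights, exp for the standard deviations and tanh for the correlation, are assumptions in the spirit of sketch-rnn):

```python
import torch

def split_gmm_params(o_t, N=20):
    """Decompose a (batch, 6N) output O_t into GMM parameters: the six
    chunks are w, mu_x, mu_y, sigma_x, sigma_y, rho, each of size N."""
    w, mu_x, mu_y, s_x, s_y, rho = torch.chunk(o_t, 6, dim=-1)
    return (torch.softmax(w, dim=-1),   # weights w_n, summing to 1
            mu_x, mu_y,                 # expectations of x and y
            torch.exp(s_x),             # standard deviation of x > 0
            torch.exp(s_y),             # standard deviation of y > 0
            torch.tanh(rho))            # correlation coefficient in (-1, 1)
```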
B2, determining the probability p(x, y; t) of sampling Tt when the decomposed Ot is input into the Gaussian mixture model;
wherein the probability p(x, y; t) is:
p(x, y; t) = Σn w(n, t) N(x, y | μ(x,n,t), μ(y,n,t), σ(x,n,t), σ(y,n,t), ρ(xy,n,t))
wherein w(n, t) denotes the weight of the n-th component of the Gaussian mixture model at moment t;
N(x, y) denotes that the coordinates (x, y) follow a normal distribution with parameters μ, σ, ρ;
μ(x, n, t) denotes the expectation of the abscissa of the n-th Gaussian model at moment t;
μ(y, n, t) denotes the expectation of the ordinate of the n-th Gaussian model at moment t;
σ(x, n, t) denotes the standard deviation of the abscissa of the n-th Gaussian model at moment t;
σ(y, n, t) denotes the standard deviation of the ordinate of the n-th Gaussian model at moment t;
ρ(xy, n, t) denotes the correlation coefficient;
B3, substituting the probability p(x, y; t) into the reconstruction error function to obtain the reconstruction error; maximizing the reconstruction error makes the Gaussian mixture model output the target vector Tt;
wherein the reconstruction error function is:
LR = (1/68) Σt log p(xt, yt; t), t = 0, 1, ..., 67
wherein LR is the reconstruction error;
(x, y) denote the abscissa and ordinate of a feature point.
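The mixture probability p(x, y; t) and the reconstruction objective LR over the 68 steps can be sketched as follows (the small stabilising constants are assumptions):

```python
import math
import torch

def bivariate_normal(x, y, mu_x, mu_y, s_x, s_y, rho, eps=1e-6):
    """Density of a correlated 2-D Gaussian N(x, y | mu, sigma, rho)."""
    zx = (x - mu_x) / (s_x + eps)
    zy = (y - mu_y) / (s_y + eps)
    q = zx ** 2 + zy ** 2 - 2.0 * rho * zx * zy
    norm = 2.0 * math.pi * s_x * s_y * torch.sqrt(1.0 - rho ** 2) + eps
    return torch.exp(-q / (2.0 * (1.0 - rho ** 2) + eps)) / norm

def reconstruction_objective(x, y, w, mu_x, mu_y, s_x, s_y, rho):
    """L_R = mean_t log p(x_t, y_t; t); training maximizes this value.
    x, y: (batch, 68, 1); mixture parameters: (batch, 68, N)."""
    p = (w * bivariate_normal(x, y, mu_x, mu_y, s_x, s_y, rho)).sum(-1)
    return torch.log(p + 1e-6).mean()
```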
A4, successively inputting several real face images into the previously trained Seq2Seq VAE network and repeating steps A1-A3 until the Seq2Seq VAE network converges, thereby obtaining the trained Seq2Seq VAE model.
In the Seq2Seq VAE network there is also a consistency loss LC, which explains the source points S with the log-likelihood of the generative probability distribution; LC is related to maintaining the basic structure of the face.
Each LSTM step has a source point St; the consistency loss obtained at each LSTM step is automatically back-propagated to adjust the network structure of the decoder, which then generates the exaggerated structure sequence points.
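One reading of the consistency term, reusing `bivariate_normal` from the sketch above: the same per-step mixture is asked to also explain the coordinates of the source points St, which penalizes exaggerations that abandon the basic facial structure (this exact form is an assumption):

```python
import torch

def consistency_loss(sx, sy, w, mu_x, mu_y, s_x, s_y, rho):
    """L_C: negative mean log-likelihood of the source points (sx, sy)
    under the mixture that generates the exaggerated sequence."""
    p = (w * bivariate_normal(sx, sy, mu_x, mu_y, s_x, s_y, rho)).sum(-1)
    return -torch.log(p + 1e-6).mean()
```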
In one embodiment of the invention, experimental results of the method of the present invention on the MMC data set are provided:
1. The images in the MMC data set are divided into 500 training pairs and 47 test pairs, and an additional 100 real face images are added;
2. For the encoder, 256-dimensional feature vectors h→ and h← are extracted from the forward and reverse source data respectively; the 512-dimensional vector h obtained by concatenation is then used as the input of the VAE, and the dimension of the vector z is 128. The GMM (Gaussian mixture model) is set to 20 normal distributions, and the output dimension of the decoder LSTM is 120;
3. The importance of the Kullback-Leibler divergence loss LKL, the reconstruction error LR and the consistency loss LC is studied. The complete method is then compared with several variants; afterwards, the influence of the batch size on the generated exaggerated sketches is analyzed, and the validity of the system is improved by local exaggeration. Finally, the exaggerated portraits are transferred into various artistic styles.
Experimental results and analysis:
A. Loss function analysis:
In Fig. 5, the complete objective L is compared with several of its variants: one uses the Kullback-Leibler divergence loss LKL and the reconstruction error LR, the other the reconstruction error LR and the consistency loss LC. All Seq2Seq VAE models are trained with a batch size of 64 samples, using the Adam optimization algorithm with a learning rate of 0.01 and gradient clipping at 1.0, with the loss weights set as follows (a schematic training step is sketched after the list):
(1) in the LKL and LR experiment, α = 0.8 and β = 0 are set;
(2) in the LR and LC experiment, α = 0 and β = 2 are set;
(3) in the LKL + LR + LC experiment, α = 0.5 and β = 0.5 are set.
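Under these settings, one training step can be sketched as follows (assuming a `model` that returns the three terms; the combination loss = −LR + α·LKL + β·LC is an assumption consistent with the α, β values above):

```python
import torch

# embodiment settings: batch size 64, Adam, learning rate 0.01, clip 1.0
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

def train_step(batch, alpha=0.5, beta=0.5):
    L_R, L_KL, L_C = model(batch)            # the three loss terms
    loss = -L_R + alpha * L_KL + beta * L_C  # maximize L_R, minimize others
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    return loss.item()
```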
The experimental results show that all three losses play a crucial role in obtaining high-quality results. From the original sketches in the second row and the exaggerated sketches in the third row, it can be found that LKL makes the generated sketches more exaggerated, while the preservation of the original facial structure is mainly achieved by minimizing the LC loss. The complete model minimizes LKL, LR and LC simultaneously, so as to keep the basic structure of the original image while exaggerating the original sketch. The complete Seq2Seq VAE model not only exaggerates the facial features but also retains the recognizability of the person. However, the complete model still has some drawbacks: it only slightly exaggerates certain discriminative features of the original image and does not reach the ideal degree of exaggeration.
B. Batch size analysis:
A change of the batch size may cause the network to oscillate between randomness and determinism; in this Seq2Seq VAE, the most obvious manifestation of randomness and determinism is whether the generated sketch is distorted or faithfully restored. The comparison results of different real face images in Fig. 6 show that the batch size directly affects the stability of the generated sequences.
As shown in Fig. 6, when the batch size equals 16, the degree of randomness in the generated network is much greater than its stability, leading to serious distortion of the generated images. From the simple sketches in the second and fourth rows it is also apparent that when the batch is small, the deformation of the facial structure is very serious. As the batch size increases, the stability of the network increases accordingly, and the generated sequences are more consistent with the sequences of the source images. For example, when the batch size equals 128, the degree of exaggeration of the source images is very flat. In order that the model, when exaggerating a source image, can not only maintain the basic structure of the facial features but also significantly exaggerate its obvious features, the batch size is set to 64 in the other experiments.
C. Local exaggeration:
In general, when drawing a portrait caricature, artists often exaggerate the obvious features that differ from the 'average face'. Therefore, in the proposed system, the method of the present invention proposes a local exaggeration method. By means of 'average face' data sets for men and women of different countries, the feature distributions of the input face and the 'average face' are compared in proportion, and the obvious features of the subject can be obtained; as shown in Fig. 7, the feature proportions of an input face and the corresponding 'average face' can be compared.
By inverting the x and y values of the corresponding local coordinate points (see the sketch below), the corresponding change is made to exaggerate these local facial organs, which can further enhance the caricature effect generated by the system. However, some problems remain: the hair style, forehead, ears and cheeks of a person cannot be extracted in the face alignment step, so these local features cannot be compared and exaggerated. Judging from the experimental results, this method can reasonably exaggerate the extracted features. Fig. 8 shows the local exaggeration results of original images: the first column is the original image; the second column is the feature structure distribution, where the blue points are the structure of the original face and the yellow points the structure after local adjustment; when the local change is applied to the original face, the deformation result in the third column is obtained; the caricature in the fourth column is the corresponding target. Although the result does not reach the effect of the target output, a certain exaggerated humorous effect can still be obtained.
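A sketch of this local operation on the offset sequence (the interpretation of 'inverting' as reflecting the offsets of one organ, and the part indices, are assumptions):

```python
import numpy as np

def exaggerate_part(seq, part_idx, flip_x=True, flip_y=True):
    """Locally exaggerate one facial organ by reflecting the x/y offsets
    of its sequence points; seq is the (68, 5) Q(x, y, p1, p2, p3) array."""
    out = seq.copy()
    if flip_x:
        out[part_idx, 0] *= -1.0
    if flip_y:
        out[part_idx, 1] *= -1.0
    return out

# e.g. exaggerate the nose (points 27-35 in the standard 68-point layout)
# exaggerated = exaggerate_part(offset_seq, np.arange(27, 36))
```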
Final results:
The generated exaggerated portrait caricatures are humorous and exaggerated while not damaging the recognizability of the person, and also reflect the drawing styles of different caricaturists. Different styles are trained through CycleGAN, such as cartoon style, oil painting style, sketch style and the like. Fig. 9 shows the results of applying different artistic styles to the exaggerated faces.
The invention has the following beneficial effects: the portrait caricature generation method based on a neural network provided by the invention creatively proposes storing the facial structure features as sequence features, so that a Seq2Seq VAE model can be used to generate the exaggerated sequences, which are then applied to caricature generation. The method overcomes the limitations of existing image translation methods: the generated exaggerated caricatures are humorous and exaggerated while not damaging the recognizability of the person, and also reflect the drawing styles of different caricaturists.

Claims (8)

1. A portrait caricature generation method based on a neural network, characterized by comprising the following steps:
S1, extracting the structural features of the face in a real face image, and converting the extracted structural feature data into sequence feature data;
S2, inputting the sequence feature data into a trained Seq2Seq VAE model to generate the exaggerated structure sequence points corresponding to the face image;
S3, applying the generated exaggerated structure sequence points to the real face image by means of thin-plate spline interpolation, thereby realizing the exaggerated deformation of the real face image;
S4, applying a cartoon style to the face image after the exaggerated deformation by means of CycleGAN, thereby generating the portrait caricature.
2. The portrait caricature generation method based on a neural network according to claim 1, characterized in that in step S1 the structural features of the face include the contour structure features and the organ structure features of the face;
step S1 specifically comprises:
extracting 68 sequence points of the real face image as the structural feature data of the face, and obtaining, from the absolute coordinates of each sequence point, the offset coordinate sequence of each sequence point relative to the previous sequence point; the offset coordinate sequence is the sequence feature data;
wherein each sequence point carries a state value, and a sequence point with its state value is expressed as Q(x, y, p1, p2, p3);
wherein x and y indicate the offset of the sequence point relative to the previous sequence point in the x and y directions;
p1, p2, p3 form a binary one-hot vector of three facial states: p1 indicates that the sequence point is the starting point of the facial contour or of a facial organ, p2 indicates that the sequence point belongs to the same organ as the previous sequence point, and p3 indicates that the sequence point is the last of the 68 sequence points.
3. The portrait caricature generation method based on a neural network according to claim 1, characterized in that the method for training the Seq2Seq VAE model in step S2 specifically comprises:
A1, inputting the forward sequence of the facial structure features of one real face image and its reverse sequence into the encoder to obtain the forward feature vector h→ and the reverse feature vector h←, and concatenating them into the final feature vector h;
A2, mapping the final feature vector h through two fully connected layer networks to the mean vector μ and the standard deviation vector σ respectively, and sampling a random vector z that follows the normal distribution determined by μ and σ;
A3, inputting the random vector z into the decoder to obtain an initially trained Seq2Seq VAE network;
A4, successively inputting several real face images into the previously trained Seq2Seq VAE network and repeating steps A1-A3 until the Seq2Seq VAE network converges, thereby obtaining the trained Seq2Seq VAE model.
4. The portrait caricature generation method based on a neural network according to claim 3, characterized in that the encoder in step A1 comprises a bidirectional LSTM network module, and the bidirectional LSTM network module comprises two LSTM networks each with 68 time steps.
5. The portrait caricature generation method based on a neural network according to claim 4, characterized in that step A1 specifically comprises:
inputting each element of the forward sequence of the facial structure features of one real face image into one LSTM network to obtain the forward feature vector h→; at the same time, inputting each element of the reverse sequence of the facial structure features of the real face image into the other LSTM network to obtain the reverse feature vector h←; and concatenating the forward feature vector h→ and the reverse feature vector h← into the final feature vector h;
wherein the forward sequence is (S0, S1, ..., Si, ..., S67);
the reverse sequence is (S67, S66, ..., Si, ..., S0);
wherein i = 0, 1, 2, ..., 67.
6. The portrait caricature generation method based on a neural network according to claim 3, characterized in that in step A2:
the random vector z is:
z = μ + σ ⊙ N(0, 1)
wherein ⊙ denotes element-wise vector multiplication;
N(0, 1) is an IID Gaussian vector.
7. The portrait caricature generation method based on a neural network according to claim 3, characterized in that in step A3 the decoder is an LSTM network with a time span of 68;
the input of the LSTM network at each moment further includes the vector Tt obtained at the previous moment and the source point St;
the LSTM network outputs a vector Ot at each moment, and the output vector Ot at the current moment t is sampled through a Gaussian mixture model to obtain the vector Tt, which is input into the LSTM network at the next moment;
wherein t denotes the moment, t = 0, 1, 2, ..., 67;
the vector T0 and the source point S0 input into the LSTM network at the initial moment are initialized to (0, 0, 1, 0, 0).
8. The portrait caricature generation method based on a neural network according to claim 7, characterized in that the method by which the output vector Ot at the current moment t is sampled through the Gaussian mixture model to obtain the vector Tt specifically comprises:
B1, determining the number N of normal distributions in the Gaussian mixture model, setting the dimension of the output vector Ot to 6N, and decomposing Ot as:
Ot = (wn, μ(x,n), μ(y,n), σ(x,n), σ(y,n), ρ(xy,n)), n = 1, 2, ..., N
wherein n denotes the n-th component of the Gaussian mixture model;
x denotes the abscissa;
y denotes the ordinate;
wn denotes the weight of the n-th component, with Σn wn = 1;
μ(x,n) denotes the expectation of the abscissa x;
μ(y,n) denotes the expectation of the ordinate y;
σ(x,n) denotes the standard deviation of the abscissa x;
σ(y,n) denotes the standard deviation of the ordinate y;
ρ(xy,n) denotes the correlation coefficient;
B2, determining the probability p(x, y; t) of sampling Tt when the decomposed Ot is input into the Gaussian mixture model;
wherein the probability p(x, y; t) is:
p(x, y; t) = Σn w(n, t) N(x, y | μ(x,n,t), μ(y,n,t), σ(x,n,t), σ(y,n,t), ρ(xy,n,t))
wherein w(n, t) denotes the weight of the n-th component of the Gaussian mixture model at moment t;
N(x, y) denotes that the coordinates (x, y) follow a normal distribution with parameters μ, σ, ρ;
μ(x, n, t) denotes the expectation of the abscissa of the n-th Gaussian model at moment t;
μ(y, n, t) denotes the expectation of the ordinate of the n-th Gaussian model at moment t;
σ(x, n, t) denotes the standard deviation of the abscissa of the n-th Gaussian model at moment t;
σ(y, n, t) denotes the standard deviation of the ordinate of the n-th Gaussian model at moment t;
ρ(xy, n, t) denotes the correlation coefficient;
B3, substituting the probability p(x, y; t) into the reconstruction error function to obtain the reconstruction error; maximizing the reconstruction error makes the Gaussian mixture model output the target vector Tt;
wherein the reconstruction error function is:
LR = (1/68) Σt log p(xt, yt; t), t = 0, 1, ..., 67
wherein LR is the reconstruction error;
(x, y) denote the abscissa and ordinate of a feature point.
CN201811631295.7A 2018-12-29 2018-12-29 Portrait cartoon generating method based on neural network Active CN109741247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811631295.7A CN109741247B (en) 2018-12-29 2018-12-29 Portrait cartoon generating method based on neural network


Publications (2)

Publication Number Publication Date
CN109741247A (en) 2019-05-10
CN109741247B CN109741247B (en) 2020-04-21

Family

ID=66362127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811631295.7A Active CN109741247B (en) 2018-12-29 2018-12-29 Portrait cartoon generating method based on neural network

Country Status (1)

Country Link
CN (1) CN109741247B (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030206171A1 (en) * 2002-05-03 2003-11-06 Samsung Electronics Co., Ltd. Apparatus and method for creating three-dimensional caricature
US7486296B2 (en) * 2004-10-18 2009-02-03 Reallusion Inc. Caricature generating system and method
KR20070096621A (en) * 2006-03-27 2007-10-02 (주)제이디에프 The system and method for making a caricature using a shadow plate
CN101477696A (en) * 2009-01-09 2009-07-08 彭振云 Human character cartoon image generating method and apparatus
CN101551911A (en) * 2009-05-07 2009-10-07 上海交通大学 Human face sketch portrait picture automatic generating method
CN103116902A (en) * 2011-11-16 2013-05-22 华为软件技术有限公司 Three-dimensional virtual human head image generation method, and method and device of human head image motion tracking
KR20130120175A (en) * 2012-04-25 2013-11-04 양재건 Apparatus, method and computer readable recording medium for generating a caricature automatically
CN103218842A (en) * 2013-03-12 2013-07-24 西南交通大学 Voice synchronous-drive three-dimensional face mouth shape and face posture animation method
CN104200505A (en) * 2014-08-27 2014-12-10 西安理工大学 Cartoon-type animation generation method for human face video image
CN104463779A (en) * 2014-12-18 2015-03-25 北京奇虎科技有限公司 Portrait caricature generating method and device
CN107730573A (en) * 2017-09-22 2018-02-23 西安交通大学 A kind of personal portrait cartoon style generation method of feature based extraction
CN108596024A (en) * 2018-03-13 2018-09-28 杭州电子科技大学 A kind of illustration generation method based on human face structure information
CN109308731A (en) * 2018-08-24 2019-02-05 浙江大学 The synchronous face video composition algorithm of the voice-driven lip of concatenated convolutional LSTM

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Junhong Huang et al., "Cartoon-to-Photo Facial Translation with Generative Adversarial Networks", Proceedings of Machine Learning Research 95. *
Deng Wei et al., "Generating personalized face cartoons using image warping", Computer Engineering and Applications. *
Yan Fang et al., "A caricature-style face portrait generation algorithm", Journal of Computer-Aided Design & Computer Graphics. *
Chen Weihua, "Human body animation and face exaggeration based on image warping", China Masters' Theses Full-text Database, Information Science and Technology. *
Chen Wenjuan et al., "Caricature exaggeration and synthesis using facial features and their relations", Journal of Computer-Aided Design & Computer Graphics. *

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197226A (en) * 2019-05-30 2019-09-03 厦门大学 A kind of unsupervised image interpretation method and system
CN110197226B (en) * 2019-05-30 2021-02-09 厦门大学 Unsupervised image translation method and system
CN111127309B (en) * 2019-12-12 2023-08-11 杭州格像科技有限公司 Portrait style migration model training method, portrait style migration method and device
CN111127309A (en) * 2019-12-12 2020-05-08 杭州格像科技有限公司 Portrait style transfer model training method, portrait style transfer method and device
CN111161137A (en) * 2019-12-31 2020-05-15 四川大学 Multi-style Chinese painting flower generation method based on neural network
CN111243050A (en) * 2020-01-08 2020-06-05 浙江省北大信息技术高等研究院 Portrait simple stroke generation method and system and drawing robot
CN111243051A (en) * 2020-01-08 2020-06-05 浙江省北大信息技术高等研究院 Portrait photo-based stroke generating method, system and storage medium
CN111243050B (en) * 2020-01-08 2024-02-27 杭州未名信科科技有限公司 Portrait simple drawing figure generation method and system and painting robot
CN111243051B (en) * 2020-01-08 2023-08-18 杭州未名信科科技有限公司 Portrait photo-based simple drawing generation method, system and storage medium
CN111402394A (en) * 2020-02-13 2020-07-10 清华大学 Three-dimensional exaggerated cartoon face generation method and device
CN111402394B (en) * 2020-02-13 2022-09-20 清华大学 Three-dimensional exaggerated cartoon face generation method and device
CN111508048A (en) * 2020-05-22 2020-08-07 南京大学 Automatic generation method for human face cartoon with interactive arbitrary deformation style
CN112241704A (en) * 2020-10-16 2021-01-19 百度(中国)有限公司 Method and device for judging portrait infringement, electronic equipment and storage medium
CN112241704B (en) * 2020-10-16 2024-05-31 百度(中国)有限公司 Portrait infringement judging method and device, electronic equipment and storage medium
CN112463912A (en) * 2020-11-23 2021-03-09 浙江大学 Raspberry pie and recurrent neural network-based simple stroke identification and generation method
CN112396693A (en) * 2020-11-25 2021-02-23 上海商汤智能科技有限公司 Face information processing method and device, electronic equipment and storage medium
CN112818118A (en) * 2021-01-22 2021-05-18 大连民族大学 Reverse translation-based Chinese humor classification model
CN112818118B (en) * 2021-01-22 2024-05-21 大连民族大学 Reverse translation-based Chinese humor classification model construction method
CN113158948A (en) * 2021-04-29 2021-07-23 宜宾中星技术智能***有限公司 Information generation method and device and terminal equipment
CN113743520A (en) * 2021-09-09 2021-12-03 广州梦映动漫网络科技有限公司 Cartoon generation method, system, medium and electronic terminal
CN117291138A (en) * 2023-11-22 2023-12-26 全芯智造技术有限公司 Method, apparatus and medium for generating layout elements
CN117291138B (en) * 2023-11-22 2024-02-13 全芯智造技术有限公司 Method, apparatus and medium for generating layout elements

Also Published As

Publication number Publication date
CN109741247B (en) 2020-04-21


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant