CN111915693A - Sketch-based face image generation method and system - Google Patents
- Publication number: CN111915693A (application CN202010439641.2A)
- Authority: CN (China)
- Legal status: Granted
Classifications
- G06T11/00 — 2D [Two Dimensional] image generation
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06T7/11 — Region-based segmentation
- G06T2207/10004 — Still image; Photographic image
- Y02T10/40 — Engine management systems
Abstract
The invention provides a sketch-based face image generation method and system, comprising the following steps: extracting multiple parts of the face in a hand-drawn sketch through a feature extraction network to obtain multiple feature vectors, and taking the reasonable expression of each feature vector in its manifold space as an optimized vector; decoding and mapping each optimized vector into a feature tensor with a feature mapping network, splicing all feature tensors into a complete face feature tensor, and synthesizing a highly realistic face image from the face feature tensor with an image synthesis network. The invention addresses two problems of the prior art, namely the high quality requirements on the input image and the poor realism of the generated results, and provides a high-quality, sketch-driven interactive face image generation method and system.
Description
Technical Field
The present invention relates to computer graphics and computer vision, and more particularly to a method for synthesizing and editing face images from sketches.
Background
With the development of deep learning, image editing and translation work in the vision field has grown rapidly, and work on translating hand-drawn sketch images into real faces has gradually appeared. However, most existing solutions take professional sketches or contour maps extracted from real images as input, perform poorly on sketches drawn by ordinary users, and no interactive system for synthesizing face images has appeared so far. The technology of generating a real face image from a sketch has wide application in real life, for example in criminal investigation and character design. For such applications, existing research often adopts image-to-image translation, which requires the input to depict the person in as much detail as possible; when the depicted details are not fine enough, it is often difficult to generate a realistic face image, which limits the range of users.
Given the simplicity and ease of use of sketches, which are often used to depict objects or faces, the prior art can edit face images with sketches, but it is difficult to generate a high-quality face image when only a sketch is given as input. Classical network models can translate sketch images into real images; however, when the input is a rough or incomplete hand-drawn sketch, they often fail to generate realistic results. The prior art can also complete a sketch and generate a real image, performing well on simple objects such as pineapples and strawberries, but poorly when migrated to face image generation.
Disclosure of Invention
To support generation and interaction from hand-drawn sketch images to real face images, the invention designs a sketch-based interactive system. The face image is partitioned with a deep neural network, data priors, and a local-to-global optimization strategy, and the optimized sub-regions are merged to generate a high-quality face image. The invention addresses the problems of high input-quality requirements and poor realism of generated results in the prior art, and provides a high-quality, sketch-driven interactive face image generation method and system. Software is designed for the method that guides and helps the user's drawing in the form of a shadow map and then generates the corresponding real image.
Addressing the defects of the prior art, the invention provides a sketch-based face image generation method, comprising:
Step 1, extracting multiple parts of the face in a hand-drawn sketch through a feature extraction network to obtain multiple feature vectors, and taking the reasonable expression of each feature vector in its manifold space as an optimized vector;
Step 2, decoding and mapping each optimized vector into a feature tensor with a feature mapping network, splicing all feature tensors into a complete face feature tensor, and synthesizing a realistic face image from the face feature tensor with an image synthesis network.
In the sketch-based face image generation method, step 1 comprises:
Step 11, setting overlapping windows as the segmentation regions for the left-eye, right-eye, nose, and mouth regions of the face image in the hand-drawn sketch, obtaining the remaining region by inverting the selection of these four regions, and independently extracting the features $s_c$ of each region $c$;
Step 12, training an auto-encoding network with the features of each region as training data to obtain an encoder model $E_c$; the auto-encoding network consists of a multi-layer encoder and a multi-layer decoder with a fully connected layer between them, and a residual block is added after each convolution or deconvolution operation of the encoder and decoder to construct the hidden-layer descriptor.
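As a rough sense of scale for such an auto-encoder, the shape flow can be sketched in Python; the input resolution (128×128), channel widths, and kernel/stride/padding choices below are assumptions for illustration, not values given in this text:

```python
# Trace tensor shapes through a five-layer, stride-2 convolutional encoder
# whose last feature map is mapped by a fully connected layer to a
# 512-dimensional hidden-layer vector. All sizes are illustrative assumptions.

def conv_out(size, kernel=4, stride=2, pad=1):
    """Spatial size after one stride-2 convolution."""
    return (size + 2 * pad - kernel) // stride + 1

def encoder_shapes(size=128, channels=(32, 64, 128, 256, 512)):
    """Shape after each conv layer; residual blocks preserve shape,
    so only the convolutions change the spatial size."""
    shapes = []
    for ch in channels:
        size = conv_out(size)
        shapes.append((ch, size, size))
    return shapes

LATENT_DIM = 512  # fully connected bottleneck between encoder and decoder

print(encoder_shapes())
# [(32, 64, 64), (64, 32, 32), (128, 16, 16), (256, 8, 8), (512, 4, 4)]
```

A mirrored five-layer deconvolutional decoder would reverse these shapes back to the sketch resolution.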
In the sketch-based face image generation method, step 1 further comprises:
Step 13, constructing a sketch image dataset $S = \{s_i\}$ as the training set for the hidden layer and extracting the features of every picture in the training set; for a region $c$, using the encoder model $E_c$ to construct the hidden-layer feature set $F_c = \{f_c^i\}$ from the training set, all feature points of which are distributed inside a low-dimensional manifold space $M_c$;
Step 14, given an input hand-drawn sketch $s$, using the encoder model $E_c$ to extract the feature vector $f_c$ of the corresponding region $c$, and projecting the feature vector $f_c$ of the $c$-th part onto the manifold space $M_c$ by retrieval and interpolation, written $f_c \to M_c$.
The retrieval and interpolation specifically comprise:
Step 141, for the feature vector $f_c$ of region $c$, searching the training feature set $F_c$ by the Euclidean distance between feature vectors for the $K$ most similar samples $\{f_c^k\}_{k=1}^{K}$; this set of $K$ nearest samples represents the neighboring features of $f_c$ in the manifold space $M_c$;
Step 142, solving for the interpolation weights through the minimization problem
$$\min_{w}\ \Big\| f_c - \sum_{k=1}^{K} w_k\, f_c^k \Big\|^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} w_k = 1,$$
where $w_k$ is the weight of sample $f_c^k$; the weights can be obtained by solving a single constrained least-squares problem;
Step 143, given the solved weights $w$, the projected point on the manifold space $M_c$ can be represented as
$$\operatorname{proj}(f_c) = \sum_{k=1}^{K} w_k\, f_c^k.$$
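The retrieval-and-interpolation projection described in steps 141 through 143 amounts to the local step of locally linear embedding. A minimal NumPy sketch, assuming the standard regularized closed-form solution of the sum-to-one constrained least squares (the regularization constant is an implementation detail assumed here, not stated in the text):

```python
import numpy as np

def project_to_manifold(f, F, K=5, reg=1e-3):
    """Project feature vector f onto the local neighborhood of the training
    feature set F (one sample per row) via K-nearest-neighbor retrieval and
    LLE-style interpolation with weights summing to one."""
    dists = np.linalg.norm(F - f, axis=1)   # Euclidean distance to every sample
    idx = np.argsort(dists)[:K]             # K most similar samples
    N = F[idx]                              # (K, d) neighbor matrix
    # min_w ||f - sum_k w_k N_k||^2  s.t.  sum_k w_k = 1
    # closed form: solve C w = 1 with the local Gram matrix C, then normalize.
    D = N - f                               # neighbors expressed relative to f
    C = D @ D.T
    C += np.eye(K) * reg * np.trace(C)      # regularize for numerical stability
    w = np.linalg.solve(C, np.ones(K))
    w /= w.sum()                            # enforce the sum-to-one constraint
    return w @ N, w                         # projected point on M_c, weights

rng = np.random.default_rng(0)
F = rng.normal(size=(100, 8))               # stand-in hidden-layer feature set F_c
f = rng.normal(size=8)                      # stand-in feature vector f_c
proj, w = project_to_manifold(f, F)
```

The projected point is a convex-style combination of real training features, which is what keeps the optimized vector inside the space of reasonable expressions.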
In the sketch-based face image generation method, step 2 comprises:
Step 21, mapping each optimized vector in the hidden-layer space to multi-channel features to generate a three-dimensional feature tensor, and splicing the three-dimensional feature tensors of the regions at the exact positions of the corresponding face components in the hand-drawn sketch to obtain the complete face feature tensor.
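The splicing step can be sketched as writing each region's decoded tensor back at its window position, starting from the "other" (background) tensor; the window coordinates, tensor sizes, and 32-channel depth below are illustrative assumptions:

```python
import numpy as np

def splice_face_tensor(region_tensors, windows):
    """region_tensors: region name -> (C, h, w) feature tensor.
    windows: region name -> (top, left) position in the full tensor."""
    full = region_tensors["other"].copy()          # start from the background region
    for name in ("left_eye", "right_eye", "nose", "mouth"):
        t = region_tensors[name]
        top, left = windows[name]
        c, h, w = t.shape
        full[:, top:top + h, left:left + w] = t    # replace at the exact position
    return full

C, H, W = 32, 64, 64
regions = {
    "other": np.zeros((C, H, W)),
    "left_eye": 1 * np.ones((C, 12, 16)),
    "right_eye": 2 * np.ones((C, 12, 16)),
    "nose": 3 * np.ones((C, 16, 12)),
    "mouth": 4 * np.ones((C, 10, 18)),
}
windows = {"left_eye": (18, 10), "right_eye": (18, 38),
           "nose": (28, 26), "mouth": (44, 23)}
full = splice_face_tensor(regions, windows)
```

Plain overwriting mirrors the "replacement operation" named later in the text; a real implementation might instead blend inside the overlap bands of the overlapping windows.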
The sketch-based face image generation method further comprises:
Step 3, given the sketch currently being drawn, superposing the $K$ sketch images from the sketch image dataset most similar to it with weights, displaying the result on the drawing board as a shadow map with 30% transparency, and updating the shadow map in real time as the user draws.
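The shadow-map overlay can be sketched as below; the inverse-distance weighting is an assumption (the text specifies only a weighted superposition of the K most similar sketches displayed at 30% transparency):

```python
import numpy as np

def shadow_map(canvas, neighbors, dists, alpha=0.3):
    """canvas and neighbors[i]: grayscale float images in [0, 1] (white = 1).
    dists[i]: feature-space distance of neighbor i to the current drawing."""
    w = 1.0 / (np.asarray(dists) + 1e-6)    # closer sketches weigh more (assumed)
    w /= w.sum()
    # weighted superposition of the K most similar sketch images
    overlay = np.tensordot(w, np.stack(neighbors), axes=1)
    # composite onto the drawing board at 30% opacity
    return (1 - alpha) * canvas + alpha * overlay

canvas = np.ones((64, 64))                  # blank white drawing board
neighbors = [np.zeros((64, 64)), 0.5 * np.ones((64, 64))]
guided = shadow_map(canvas, neighbors, dists=[1.0, 1.0])
```

Recomputing the K nearest sketches and this composite on every stroke yields the real-time shadow guidance described above.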
The invention also provides a sketch-based face image generation system, comprising:
Module 1, which extracts multiple parts of the face in a hand-drawn sketch through a feature extraction network to obtain multiple feature vectors, and takes the reasonable expression of each feature vector in its manifold space as an optimized vector;
Module 2, which decodes and maps each optimized vector into a feature tensor with the feature mapping network, splices all feature tensors into a complete face feature tensor, and synthesizes a realistic face image from the face feature tensor with the image synthesis network.
In the sketch-based face image generation system, module 1 comprises:
Module 11, which sets overlapping windows as the segmentation regions for the left-eye, right-eye, nose, and mouth regions of the face image in the hand-drawn sketch, obtains the remaining region by inverting the selection of these four regions, and independently extracts the features $s_c$ of each region $c$;
Module 12, which trains an auto-encoding network with the features of each region as training data to obtain an encoder model $E_c$; the auto-encoding network consists of a multi-layer encoder and a multi-layer decoder with a fully connected layer between them, and a residual block is added after each convolution or deconvolution operation of the encoder and decoder to construct the hidden-layer descriptor.
In the sketch-based face image generation system, module 1 further comprises:
Module 13, which constructs a sketch image dataset $S = \{s_i\}$ as the training set for the hidden layer and extracts the features of every picture in the training set; for a region $c$, the encoder model $E_c$ is used to construct the hidden-layer feature set $F_c = \{f_c^i\}$ from the training set, all feature points of which are distributed inside a low-dimensional manifold space $M_c$;
Module 14, which, given an input hand-drawn sketch $s$, uses the encoder model $E_c$ to extract the feature vector $f_c$ of the corresponding region $c$, and projects the feature vector $f_c$ of the $c$-th part onto the manifold space $M_c$ by retrieval and interpolation, written $f_c \to M_c$.
The retrieval and interpolation specifically comprise:
Module 141, which, for the feature vector $f_c$ of region $c$, searches the training feature set $F_c$ by the Euclidean distance between feature vectors for the $K$ most similar samples $\{f_c^k\}_{k=1}^{K}$; this set of $K$ nearest samples represents the neighboring features of $f_c$ in the manifold space $M_c$;
Module 142, which solves for the interpolation weights through the minimization problem
$$\min_{w}\ \Big\| f_c - \sum_{k=1}^{K} w_k\, f_c^k \Big\|^2 \quad \text{s.t.} \quad \sum_{k=1}^{K} w_k = 1,$$
where $w_k$ is the weight of sample $f_c^k$; the weights can be obtained by solving a single constrained least-squares problem;
Module 143, which, given the solved weights $w$, represents the projected point on the manifold space $M_c$ as
$$\operatorname{proj}(f_c) = \sum_{k=1}^{K} w_k\, f_c^k.$$
In the sketch-based face image generation system, module 2 comprises:
Module 21, which maps each optimized vector in the hidden-layer space to multi-channel features to generate a three-dimensional feature tensor, and splices the three-dimensional feature tensors of the regions at the exact positions of the corresponding face components in the hand-drawn sketch to obtain the complete face feature tensor.
The sketch-based face image generation system further comprises:
Module 3, which, given the sketch currently being drawn, superposes the $K$ sketch images from the sketch image dataset most similar to it with weights, displays the result on the drawing board as a shadow map with 30% transparency, and updates the shadow map in real time as the user draws.
According to the above scheme, the invention has the following advantages:
The system designed by the invention accepts arbitrary hand-drawn sketches as input and automatically provides shadow-map guidance and highly realistic face image generation.
The system flow chart is shown in fig. 1; the system involves image feature extraction, manifold projection in feature space, feature interpolation in feature space, high-quality image generation, and interactive interface design.
Fig. 2 shows the image generation effect after retrieving an adjacent component sketch in feature space and interpolating between their features. The first and fifth columns show two sketch images that are adjacent in feature space; the middle three columns are sketch images decoded from uniformly interpolated feature vectors fed to the decoder. As the results show, interpolation in feature space yields smooth image transitions, demonstrating the continuity of the feature space distribution.
Fig. 3 demonstrates the generation effect after adjusting the blending weight of the "other" region. The top left is the sketch image drawn by the user; the remaining five images are generated with different blending weight parameters for the other region. As the figure shows, the larger the blending weight coefficient, the more closely the generated image corresponds to the input.
Fig. 4 compares the generation effects after adjusting the blending weights of multiple regions. The first image is the input sketch; the second is the effect generated directly from the input sketch. The third shows the generation after feature optimization, and the fourth shows the generation after adjusting different blending parameters for different regions. As the figure shows, the user can adjust smoothly between the optimized result and the result faithful to the user's drawing by tuning the blending parameters.
Fig. 5 demonstrates the effect of adding lines and editing details while the user draws. The left images show that as the user's strokes increase, the line constraints grow, the generated result follows the drawing more closely, and the overall shading of the generated image changes. The right images show the user's detail adjustments: the method gives free control over details, and changing local detail lines does not affect the shape of other regions.
Fig. 6 shows the interactive interface design of the invention. There are two image display areas, the drawing board "SketchingCanvas" and the display board "SythesizedFace", which show the sketch content drawn by the user and the generated image, respectively. An adjustment bar at the upper right of the interface lets the user tune the weight of each region, controlling how strongly the hand-drawn sketch influences the generated result. Controls above the interface let the user select a brush or an eraser and adjust its size. Whether to turn on real-time generation and shadow guidance can be decided by selecting "Realtime" and "Shadow".
Fig. 7 shows what the drawing board and the display board present during actual use. On the drawing board there is shadow guidance, and the shadow map updates in real time with the user's strokes. Meanwhile, the generated result is displayed on the display board in real time, so the user can see the current generation effect.
Fig. 8 shows results of the invention in actual use: the first three rows are sketch images drawn by users, and the remaining rows are the corresponding face image generation results.
Fig. 9 shows the effect of applying interpolation to face images. The leftmost and rightmost images are the inputs, and the images in between are face images obtained by interpolation; face image interpolation thus yields a smooth deformation sequence.
Fig. 10 shows the effect of the invention on face blending: different parts of different faces are pieced together by inputting sketch images of the different parts, generating a new complete face image.
Drawings
FIG. 1 is a schematic flow chart of the system of the present invention;
FIG. 2 is a diagram of interpolation generation effects of different parts on a hidden layer;
FIG. 3 is a diagram showing the effect of setting different tuning mixing parameters in other areas after feature optimization;
FIG. 4 is a graph showing the effect of adjusting mixing parameters for different regions after feature optimization;
FIG. 5 is an add detail generate effect (left) and a partial detail edit generate effect (right) diagram;
FIG. 6 is an interactive interface effect display diagram;
FIG. 7 is a drawing board display diagram and a generated effect display diagram on the operation interface;
FIG. 8 is a generation result display diagram;
FIG. 9 is a diagram of image interpolation generation effect;
fig. 10 is a diagram showing the mosaic composition effect of the face image.
Detailed Description
The prior art does not account for the uneven quality of user-input images: an untrained ordinary user can hardly draw an effective face sketch, and the insufficient expressiveness of the input sketch makes the generated face image unrealistic or impossible to generate. Meanwhile, ordinary users often have difficulty mastering the positions and proportions of facial features while drawing, which makes creation difficult.
The inventors analyzed the structure of facial features and, considering the distribution similarity of facial features, divided the face image into five parts (left eye, right eye, nose, mouth, and the remaining region), processing each region independently to depict local details better. The aim is to edit each region independently and prevent local edits from affecting the global effect. Operating separately on the blocked face regions expresses local details better. For the effect of local editing, see fig. 5.
When a user draws a sketch, the method extracts the features of each part independently, optimizes the features with locally linear embedding (LLE), and generates a realistic face image from the optimized features. Meanwhile, to help ordinary users who cannot draw a reasonable face sketch, the invention designs an interface that predicts and recommends sketches according to the user's drawing and places them under the drawing board as a shadow to guide the drawing. Considering users of different skill levels, a control bar is provided so the user can control how strongly the drawn sketch influences the generated result by adjusting the weight of each part.
The invention provides a high-quality face image generation method and system for sketch interaction, which comprises the following key points:
key point 1, face image feature extraction module
The face image is processed by sub-regions: given the clear structure of the face, it is divided into five parts (left eye, right eye, nose, mouth, and the remaining region), and the features of each region image are extracted independently; to better control the details of each part, the local features of each face part are learned. An auto-encoding network (AE) is trained independently on the data of each region, and the encoder provides the corresponding feature vector of the sketch image in the hidden layer.
Key point 2, feature optimization Module
Considering the diversity of sketch expression, and to optimize the extracted features into reasonable ones in manifold space, the feature projection module projects each part into the corresponding manifold distribution space so that the optimized feature vectors conform to a reasonable expression of the data; hidden-layer features extracted from any sketch image are thus optimized within the space of reasonable expressions.
Key point 3, feature mapping Module
Since feature extraction operates independently on each sub-region, after the optimized feature vectors are obtained, the hidden-layer features of each sub-region must be decoded separately to obtain their respective feature tensors. A replacement operation then merges the sub-region feature tensors into the complete face feature map.
Key point 4, image synthesis module
After the merged feature tensor is obtained, the image synthesis module converts the feature tensor into a real face image. The module adopts a conditional generative adversarial network structure, takes the face feature map as input, and generates a high-quality face image through adversarial learning and image constraints.
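A conditional adversarial objective of this kind is commonly written as follows, where $F$ is the face feature map, $I$ the corresponding real photograph, $G$ the synthesis network, and $D$ the discriminator; the exact loss terms and the weight $\lambda$ of the image constraint are not given in this text, so this is a standard formulation rather than necessarily the patent's own:

```latex
\min_G \max_D \;
  \mathbb{E}_{(F,I)}\!\left[\log D(F, I)\right]
+ \mathbb{E}_{F}\!\left[\log\!\left(1 - D\!\left(F, G(F)\right)\right)\right]
+ \lambda\, \mathbb{E}_{(F,I)}\!\left[\lVert G(F) - I \rVert_{1}\right]
```

The adversarial terms push $G(F)$ toward the distribution of real faces, while the $\ell_1$ image constraint ties the output to the specific ground-truth photo.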
Key point 5, interactive interface design and display
The interactive interface design supports users of different drawing levels and generates drawing results in real time. A shadow map guides the user during the drawing process, providing layout and detail guidance. The user can select the corresponding functions through the controls in the toolbar and adjust the generation effect by tuning the strength of the region constraints with the adjustment bars.
In order to make the aforementioned features and effects of the present invention more comprehensible, embodiments accompanied with figures are described in detail below.
As shown in fig. 1, the method of the present invention comprises:
S2: drawing a sketch of a user's hand using a feature extraction networkExtracting to obtain c pairs of feature vectors of each part
S3: aligning feature vectors in a feature manifold spaceCarrying out projection optimization to obtain a characteristic vectorIn manifold space McReasonable expression ofReasonable means that an expression closest to the real data distribution is found in the manifold space of the real data distribution;
s4: using the feature mapping network to optimize the feature vectorDecoding and mapping the feature tensor into a feature tensor, and splicing the feature tensor of each part to obtain a complete face feature tensor;
s5: and inputting a human face feature tensor by using an image synthesis network to generate a human face image with high reality as a human face image generation result of the sketch.
S6: interactive interface design and implementation
Wherein, the method in S2 includes:
s21: after the method and the device acquire the user hand-drawn sketch, the characteristics of the image need to be extracted. Processing the whole image to hardly express the diversity of face irrelevant pairs, and adopting a method of processing the face image in regions for this purpose; in view of the structure of the human faceClearly, the invention divides the face image into 5 parts (left eye, right eye, nose, mouth and other parts), and extracts the characteristic of each area image separately, which is marked as ScAnd c is {1,2,3,4,5 }; in order to better process details among the parts, the invention adopts an overlapping window to set the segmentation areas of the first four parts (the selection of the area size is determined according to the distribution range of the corresponding parts of the human beings in the database), and for the selection of other parts, we select the face image with the first four parts removed.
S22: to better control the details of each part, learning the local features of each face part requires designing a feature extraction network and training for real-time operation optimization. The method comprises two parts of feature network and training data processing.
S221: the invention adopts automatic coding network (AE) to train each region data independently, and after finishing AE network training from draft image to draft image, we can train through encoder EcTo obtain the corresponding feature vector of the sketch image in the hidden layer, and then the corresponding feature vector is processed by a decoder DcTo realize the generation of the feature vector to the sketch image; for the design of the automatic coding network, the invention adopts the form of a five-layer decoder and a five-layer encoder, and adopts the form of a full connection layer between the encoder and the decoder to ensure that the hidden layer characteristics generated by each part are 512 dimensions. For the dimension of the hidden layer features, the hidden layer features (128,256,512) with different dimensions are tested, and experiments show that the expression effect of 512 dimensions is better for the detail reconstruction and expression of the sketch, and the reconstruction result is fuzzy due to the low-dimensional features. Also after trial and error we add a residual block to build the hidden layer descriptor after each convolution/deconvolution operation of the encoding/decoding layer. The residual block is added in the network layer, so that the information loss of the network can be reduced, and the expression effect of the network model is improved.
S222: Training data must first be constructed for the feature extraction network. Since no existing real-face dataset contains paired sketch images, the invention extracts boundary contours from color images to simulate sketch images. The sketch dataset is built by extracting lines with Photoshop and then reducing the lines with a sketch-simplification method. The network parameters are trained in a self-supervised manner: the mean squared error (MSE) between the input sketch image and the reconstructed image serves as the loss, and an Adam optimizer solves for the network parameters.
Wherein the method in S3 comprises:
S31: Since the sketch image input by the user may differ from real face images, it must be mapped into the real distribution space (manifold space). For an input sketch image s, the feature extraction module obtains the feature vector corresponding to each part c. Given the randomness and diversity of sketch drawing, the features extracted from an image may not fit the feature distribution of the dataset. To optimize the extracted features so that they are reasonable in the manifold space, and so that a reasonable feature expression can be obtained even from an incomplete sketch, the invention designs a feature optimization module that extracts hidden-layer features from any sketch image and optimizes them within a reasonable expression space.
S32: First, a sketch image dataset S = {s_i} and its hidden-layer feature set must be constructed for training. Features are extracted from every picture in the training set; for a part c, the encoder model E_c trained in the feature extraction module builds the hidden-layer feature set F_c from the training set. The results shown in fig. 2 indicate that sketch images that are similar are also close in feature-space distribution, so all feature points in F_c are considered to be distributed on a low-dimensional manifold space, defined as M_c.
S33: When a sketch is input, the pre-trained encoder E_c extracts the feature vector of the corresponding part c, denoted f̂_c. To increase the realism of the expressed face, under the local-linearity assumption and following the idea of LLE (locally linear embedding), the feature vector of part c is projected onto the manifold space M_c by retrieval and interpolation; the projection is denoted f̃_c. The method comprises two parts, nearest-point retrieval and feature interpolation:
S331: For the feature vector f̂_c of part c, the invention first retrieves from the training feature set F_c the K most similar samples by computing Euclidean distances between feature vectors. Experiments and comparisons show that K = 10 ensures both facial realism and appropriate variation. Define N_c = {f_c^k}, k = 1, ..., K, as the set of the K nearest samples, i.e. the neighbors of f̂_c in the manifold space M_c.
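The nearest-point retrieval can be sketched directly with numpy (a brute-force search, which is sufficient at the feature-set sizes implied here; function and variable names are illustrative):

```python
import numpy as np

def k_nearest_features(f_hat, feature_set, k=10):
    """Return the K features in F_c closest to f_hat in Euclidean distance.

    K = 10 follows the value reported in the text; feature_set is the
    hidden-layer feature set F_c built from the training sketches.
    """
    dists = np.linalg.norm(feature_set - f_hat, axis=1)
    idx = np.argsort(dists)[:k]
    return feature_set[idx], idx

feats = np.eye(5)                      # toy 5-sample feature set
nn, idx = k_nearest_features(np.array([1.0, 0, 0, 0, 0]), feats, k=2)
```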
S332: The neighboring features N_c are used to reconstruct the feature vector f̂_c, and the weight parameters are solved by minimizing the reconstruction error. This is equivalent to solving the interpolation weights through the following minimization problem:

min_w ‖ f̂_c − Σ_{k=1}^{K} w_k f_c^k ‖², subject to Σ_{k=1}^{K} w_k = 1
where f_c^k is a sample in N_c and w_k its weight. The weights can be found by solving a constrained least-squares problem. Given the solved weights w_1, ..., w_K, the projected point on the manifold space M_c can be expressed as:

f̃_c = Σ_{k=1}^{K} w_k f_c^k
f̃_c is the optimized feature vector, which is passed to the feature mapping module and the image synthesis module for image synthesis.
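The constrained least-squares solve and the resulting manifold projection can be sketched as follows (the standard LLE closed form: solve the local Gram system against a vector of ones, then normalize so the weights sum to 1; the regularization constant is an assumption for numerical stability):

```python
import numpy as np

def lle_project(f_hat, neighbors, reg=1e-3):
    """Solve the LLE interpolation weights and project f_hat onto the
    local linear patch of the manifold.

    neighbors: (K, d) matrix of the K nearest hidden-layer features.
    Returns (weights, projection), with weights summing to 1.
    """
    diff = neighbors - f_hat                      # (K, d)
    G = diff @ diff.T                             # local Gram matrix
    G += reg * np.trace(G) * np.eye(len(neighbors))  # stabilize the solve
    w = np.linalg.solve(G, np.ones(len(neighbors)))
    w /= w.sum()                                  # enforce sum-to-one
    return w, w @ neighbors

neighbors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
w, proj = lle_project(np.array([0.25, 0.25]), neighbors)
```

Because the weights sum to 1, the projection lies in the affine hull of the neighbors, i.e. on the locally linear approximation of M_c.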
Wherein, the method in S4 includes:
S41: Given an input sketch image, the feature extraction module extracts the features, and the feature projection module projects every part onto its corresponding manifold, so that the optimized feature vectors lie in a reasonable expression space. After the feature vectors of all parts are acquired, they must be used to synthesize a complete face image. If the features were directly decoded into sketch images and stitched together, artifacts would appear at the seams between regions and the generation quality would suffer. Since sketch images have only one channel, the inconsistencies of adjacent parts in the overlap regions are difficult for a sketch-generation network to resolve automatically. This motivates mapping the feature vectors in the hidden-layer space into multi-channel features (i.e., generating three-dimensional feature tensors). Experiments show that this greatly improves information flow: fusing feature maps, instead of simply stitching sketch images, helps resolve the inconsistencies between parts. Because the features of different parts carry different semantics, a decoder model is designed for each part; this handles the differing spatial distributions of the high-dimensional features extracted from different parts and converts each 512-dimensional feature vector into a three-dimensional feature tensor. Each decoder consists of a fully connected layer and five decoding layers. Each decoded feature tensor has 32 channels, and its spatial size equals the size of the corresponding part's region in the sketch image.
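The vector-to-tensor mapping can be sketched with a single fully connected layer standing in for the per-part decoder (the text specifies one FC layer plus five decoding layers; the tiny window size below is illustrative, only the 512-dim input and 32 channels come from the text):

```python
import numpy as np

rng = np.random.default_rng(1)

def decode_to_tensor(feature, W, h, w, channels=32):
    """Map a 512-d hidden vector to a (channels, h, w) feature tensor.

    h and w equal the size of the part's window in the sketch; the
    random weight matrix W is a placeholder for the trained decoder.
    """
    flat = feature @ W                 # (512,) @ (512, C*h*w)
    return flat.reshape(channels, h, w)

h, w = 8, 10                           # tiny illustrative window size
W = rng.normal(0, 0.01, (512, 32 * h * w))
tensor = decode_to_tensor(rng.normal(size=512), W, h, w)
```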
S42: After the tensor representation of each part is obtained, the feature tensors of the left eye, right eye, nose and mouth must be put back into the feature tensor of the remaining region at the exact positions of the face components in the input sketch, so as to retain the original spatial relationships between the face components. When merging, a fixed merging order (i.e., left eye, right eye, nose, mouth, then the remaining region) is used to merge the feature maps.
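The fixed-order merge can be sketched as a sequence of pastes into the remaining-region tensor, where later parts overwrite earlier ones in the overlap areas (the position map is a hypothetical input; in the method it comes from the component windows in the sketch):

```python
import numpy as np

def merge_feature_tensors(part_tensors, positions, base):
    """Paste part feature tensors back into the remaining-region tensor.

    Fixed merge order (left eye, right eye, nose, mouth) so that later
    parts take precedence in overlap regions; positions maps part id to
    the (top, left) corner of its window in the sketch.
    """
    merged = base.copy()
    for c in (1, 2, 3, 4):             # fixed order from the text
        t, l = positions[c]
        _, ph, pw = part_tensors[c].shape
        merged[:, t:t + ph, l:l + pw] = part_tensors[c]
    return merged

base = np.zeros((2, 10, 10))
part_tensors = {c: np.full((2, 2, 2), float(c)) for c in (1, 2, 3, 4)}
positions = {1: (0, 0), 2: (0, 4), 3: (4, 2), 4: (7, 3)}
merged = merge_feature_tensors(part_tensors, positions, base)
```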
Wherein, the method in S5 includes:
After the fused feature tensor from S4 is acquired, the image synthesis module converts it into a real face image. The module adopts a conditional generative network: the feature tensor serves as the conditional input, and a discriminator model is used for adversarial training. The generation network comprises an encoder unit, residual blocks and a decoder unit. In this module, a multi-scale discriminator network judges the generated image at different scales: during training, the input image and the generated image are down-sampled to several image sizes, and each size is judged by the discriminator corresponding to that size. This setup implicitly learns the high-level associations between components. During network training, an Adam optimizer performs parameter optimization, and the feature extraction module and the image synthesis module are trained together as one network, realizing the generation from feature vectors to real pictures. For these two modules, in addition to the adversarial loss, an L1 error between the real image and the reconstructed image is added to further guide the generator and ensure the pixel-level quality of the generated image. A perceptual error is used when training the discriminator to compare the feature differences between the real image and the generated image.
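Building the multi-scale inputs for the discriminators can be sketched as a simple average-pooling pyramid (the number of scales and the pooling choice are assumptions; the text only states that images are down-sampled to different sizes):

```python
import numpy as np

def downsample_pyramid(img, levels=3):
    """Build the multi-scale inputs for the multi-scale discriminators.

    Each level halves the resolution with 2x2 average pooling; one
    discriminator judges the real/generated pair at each scale.
    """
    pyramid = [img]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape[:2]
        x = pyramid[-1][: h - h % 2, : w - w % 2]   # trim to even size
        x = 0.25 * (x[0::2, 0::2] + x[1::2, 0::2]
                    + x[0::2, 1::2] + x[1::2, 1::2])
        pyramid.append(x)
    return pyramid

pyr = downsample_pyramid(np.ones((256, 256)), levels=3)
```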
Wherein, the method in S6 includes:
S61: Since users' drawing levels vary, it may be difficult for an ordinary user to accurately position the facial features. To assist users in drawing, especially those with a low drawing level, a shadow-guided drawing interface is provided: a shadow map is updated in real time while the user draws, providing layout and detail guidance. Given the current drawing, the K most similar sketch images in the training set S are found as in S33. These K sketch images are then weighted and superimposed, each picture weighted by the normalized weights from S3, and displayed on the panel with 30% transparency. The shadow map is updated in real time as the user draws, and the generated picture is displayed on the right of the window.
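The shadow-map overlay can be sketched as a normalized weighted average of the K retrieved sketches, scaled by the 30% display transparency (function name and alpha handling are illustrative):

```python
import numpy as np

def shadow_map(similar_sketches, weights, alpha=0.3):
    """Blend the K most similar training sketches into one guidance image.

    weights are the interpolation weights from the manifold projection,
    normalized here to sum to 1; alpha is the 30% display transparency.
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    blended = np.tensordot(w, np.stack(similar_sketches), axes=1)
    return alpha * blended

imgs = [np.zeros((4, 4)), np.ones((4, 4))]
sm = shadow_map(imgs, [1.0, 3.0])
```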
S62: Since a user with a high drawing level can draw a face image well unaided, the user can choose whether to display the shadow map via the shadow control above the canvas. When the user finishes drawing, or wants to view the generated result in real time, the conversion or real-time generation function can be selected by clicking the 'Convert' control or the 'Realtime' check box.
S63: Since differences in user skill lead to differences in the expression of details, the invention designs an adjustment slider for a blending parameter: by adjusting the interpolation weight between the original image features and the optimized features, the user controls how much the drawn details of each part influence the generation result, increasing the controllability of the drawing details. Defining w_bc as the blending parameter of part c, the blended feature vector can be calculated as:

f_c^blend = w_bc · f̂_c + (1 − w_bc) · f̃_c
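This blending is a plain linear interpolation between the raw extracted feature and its manifold projection; a minimal sketch (variable names are illustrative):

```python
import numpy as np

def blend_features(f_orig, f_proj, w_bc):
    """Interpolate between the raw feature and its manifold projection.

    w_bc = 1 keeps the user's drawn details exactly; w_bc = 0 fully
    trusts the optimized (projected) feature.
    """
    return w_bc * f_orig + (1.0 - w_bc) * f_proj

f_orig = np.array([1.0, 0.0])
f_proj = np.array([0.0, 1.0])
half = blend_features(f_orig, f_proj, 0.5)
```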
The following are system embodiments corresponding to the above method embodiments, and this embodiment can be implemented in cooperation with the above embodiments. The technical details mentioned in the above embodiments remain valid in this embodiment and, to reduce repetition, are not described again here; correspondingly, the technical details of this embodiment can also be applied to the above embodiments.
The invention also provides a face image generation system based on the sketch, which comprises the following steps:
a module 1, which extracts a plurality of parts of a human face in a hand-drawn sketch through a feature extraction network to obtain a plurality of feature vectors, and takes the reasonable expression of each feature vector in a manifold space as an optimized vector;
and a module 2, which decodes and maps the optimized vectors into feature tensors using a feature mapping network, merges all the feature tensors to obtain a complete face feature tensor, and synthesizes the face feature tensor into a face image using an image synthesis network.
The face image generation system based on sketch, wherein the module 1 includes:
a module 11, which sets the segmentation areas of the left eye, right eye, nose and mouth regions of the face image in the hand-drawn sketch using overlapping windows, obtains the remaining region by inverting the selection of those four segmentation areas, and separately extracts the features S_c of each region c;
a module 12, which trains the autoencoder network with the features of each region as training data to obtain an encoder model E_c, wherein the autoencoder network comprises a multi-layer decoder and a multi-layer encoder, a fully connected layer is arranged between the encoder and the decoder, and a residual block is added after each convolution or deconvolution operation of the encoder and the decoder to construct a hidden-layer descriptor.
The face image generation system based on sketch, wherein the module 1 includes:
a module 13, which constructs a sketch image dataset S = {s_i} and its hidden-layer feature set for training: features are extracted from every picture in the training set, and for a region c the encoder model E_c is used to construct the hidden-layer feature set F_c from the training set, wherein all feature points of F_c are distributed within a low-dimensional manifold space M_c;
a module 14, which, when a hand-drawn sketch is input, uses the encoder model E_c to extract the feature vector f̂_c of the corresponding region c, and projects f̂_c onto the manifold space M_c by retrieval and interpolation, the projection being denoted f̃_c;
The retrieval and interpolation specifically comprise:
a module 141, which, for the feature vector f̂_c of region c, retrieves from the training feature set F_c the K most similar samples by computing Euclidean distances between feature vectors, the set of the K nearest samples representing the neighbors of f̂_c in the manifold space M_c;
a module 142, which solves the interpolation weights through the following minimization problem:

min_w ‖ f̂_c − Σ_{k=1}^{K} w_k f_c^k ‖², subject to Σ_{k=1}^{K} w_k = 1
where f_c^k is a sample among the K nearest neighbors and w_k its weight, the weights being solvable as a constrained least-squares problem;
a module 143, which, given the solved weights w_k, expresses the projected point on the manifold space M_c as:

f̃_c = Σ_{k=1}^{K} w_k f_c^k
The face image generation system based on sketch, wherein the module 2 includes:
a module 21, which maps the optimized vectors in the hidden-layer space into multi-channel features to generate three-dimensional feature tensors, and splices the three-dimensional feature tensors of the regions according to the exact positions of the face components in the hand-drawn sketch to obtain a complete face feature tensor.
The sketch-based face image generation system further comprises:
a module 3, which, when a current drawing sketch is given, weights and superimposes the K sketch images in the sketch image dataset most similar to the drawing, displays them on the drawing board as a shadow map with 30% transparency, and updates the shadow map in real time while the user draws.
Claims (10)
1. A face image generation method based on sketch is characterized by comprising the following steps:
step 1, extracting a plurality of parts of a human face in a hand-drawn sketch through a feature extraction network to obtain a plurality of feature vectors, and reasonably expressing each feature vector in a manifold space as an optimization vector;
and step 2, decoding and mapping the optimized vectors into feature tensors using a feature mapping network, splicing all the feature tensors to obtain a complete face feature tensor, and synthesizing the face feature tensor into a face image using an image synthesis network.
2. The sketch-based human face image generation method as claimed in claim 1, wherein the step 1 comprises:
step 11, setting the segmentation areas of the left eye, right eye, nose and mouth regions of the face image in the hand-drawn sketch using overlapping windows, obtaining the remaining region by inverting the selection of those four segmentation areas, and separately extracting the features S_c of each region c;
step 12, training the autoencoder network with the features of each region as training data to obtain an encoder model E_c, wherein the autoencoder network comprises a multi-layer decoder and a multi-layer encoder, a fully connected layer is arranged between the encoder and the decoder, and a residual block is added after each convolution or deconvolution operation of the encoder and the decoder to construct a hidden-layer descriptor.
3. The sketch-based human face image generation method as claimed in claim 2, wherein the step 1 comprises:
step 13, constructing a sketch image dataset S = {s_i} and its hidden-layer feature set for training: extracting the features of every picture in the training set, and for a region c, using the encoder model E_c to construct the hidden-layer feature set F_c from the training set, wherein all feature points of F_c are distributed within a low-dimensional manifold space M_c;
step 14, when a hand-drawn sketch is input, using the encoder model E_c to extract the feature vector f̂_c of the corresponding region c, and projecting f̂_c onto the manifold space M_c by retrieval and interpolation, the projection being denoted f̃_c;
The retrieval and interpolation specifically comprise the following steps:
step 141, for the feature vector f̂_c of region c, retrieving from the training feature set F_c the K most similar samples by computing Euclidean distances between feature vectors, the set of the K nearest samples representing the neighbors of f̂_c in the manifold space M_c;
step 142, solving the interpolation weights through the following minimization problem:

min_w ‖ f̂_c − Σ_{k=1}^{K} w_k f_c^k ‖², subject to Σ_{k=1}^{K} w_k = 1,

where f_c^k is a sample among the K nearest neighbors and w_k its weight, the weights being solvable as a constrained least-squares problem;

step 143, given the solved weights w_k, expressing the projected point on the manifold space M_c as:

f̃_c = Σ_{k=1}^{K} w_k f_c^k
4. A sketch-based human face image generation method as claimed in claim 3, wherein the step 2 comprises:
step 21, mapping the optimized vectors in the hidden-layer space into multi-channel features to generate three-dimensional feature tensors, and splicing the three-dimensional feature tensors of the regions according to the exact positions of the face components in the hand-drawn sketch to obtain a complete face feature tensor.
5. The sketch-based face image generating method as claimed in claim 4, further comprising:
step 3, when a current drawing sketch is given, weighting and superimposing the K sketch images in the sketch image dataset most similar to the drawing, displaying them on the drawing board as a shadow map with 30% transparency, and updating the shadow map in real time while the user draws.
6. A sketch-based face image generation system, comprising:
a module 1, which extracts a plurality of parts of a human face in a hand-drawn sketch through a feature extraction network to obtain a plurality of feature vectors, and takes the reasonable expression of each feature vector in a manifold space as an optimized vector;
and a module 2, which decodes and maps the optimized vectors into feature tensors using a feature mapping network, merges all the feature tensors to obtain a complete face feature tensor, and synthesizes the face feature tensor into a face image using an image synthesis network.
7. A sketch-based human face image generation system as claimed in claim 6, wherein the module 1 comprises:
a module 11, which sets the segmentation areas of the left eye, right eye, nose and mouth regions of the face image in the hand-drawn sketch using overlapping windows, obtains the remaining region by inverting the selection of those four segmentation areas, and separately extracts the features S_c of each region c;
a module 12, which trains the autoencoder network with the features of each region as training data to obtain an encoder model E_c, wherein the autoencoder network comprises a multi-layer decoder and a multi-layer encoder, a fully connected layer is arranged between the encoder and the decoder, and a residual block is added after each convolution or deconvolution operation of the encoder and the decoder to construct the hidden-layer descriptor.
8. A sketch-based human face image generation system as claimed in claim 7 wherein the module 1 comprises:
a module 13, which constructs a sketch image dataset S = {s_i} and its hidden-layer feature set for training: features are extracted from every picture in the training set, and for a region c the encoder model E_c is used to construct the hidden-layer feature set F_c from the training set, wherein all feature points of F_c are distributed within a low-dimensional manifold space M_c;
a module 14, which, when a hand-drawn sketch is input, uses the encoder model E_c to extract the feature vector f̂_c of the corresponding region c, and projects f̂_c onto the manifold space M_c by retrieval and interpolation, the projection being denoted f̃_c;
The retrieval and interpolation specifically comprise:
a module 141, which, for the feature vector f̂_c of region c, retrieves from the training feature set F_c the K most similar samples by computing Euclidean distances between feature vectors, the set of the K nearest samples representing the neighbors of f̂_c in the manifold space M_c;
a module 142, which solves the interpolation weights through the following minimization problem:

min_w ‖ f̂_c − Σ_{k=1}^{K} w_k f_c^k ‖², subject to Σ_{k=1}^{K} w_k = 1,

where f_c^k is a sample among the K nearest neighbors and w_k its weight, the weights being solvable as a constrained least-squares problem;

a module 143, which, given the solved weights w_k, expresses the projected point on the manifold space M_c as:

f̃_c = Σ_{k=1}^{K} w_k f_c^k
9. A sketch-based human face image generation system as claimed in claim 8 wherein the module 2 comprises:
a module 21, which maps the optimized vectors in the hidden-layer space into multi-channel features to generate three-dimensional feature tensors, and splices the three-dimensional feature tensors of the regions according to the exact positions of the face components in the hand-drawn sketch to obtain a complete face feature tensor.
10. The sketch-based face image generation system of claim 9, further comprising:
and a module 3, when a current drawing sketch is given, displaying the sketch image data set and K most similar sketch images of the drawing sketch on a drawing board in a shadow map form with 30% transparency after weighted superposition, and updating the shadow map in real time when a user draws.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010439641.2A CN111915693B (en) | 2020-05-22 | 2020-05-22 | Sketch-based face image generation method and sketch-based face image generation system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111915693A true CN111915693A (en) | 2020-11-10 |
CN111915693B CN111915693B (en) | 2023-10-24 |
Family
ID=73237617
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010439641.2A Active CN111915693B (en) | 2020-05-22 | 2020-05-22 | Sketch-based face image generation method and sketch-based face image generation system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111915693B (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112541958A (en) * | 2020-12-21 | 2021-03-23 | 清华大学 | Parametric modeling method and device for three-dimensional face |
CN112686822A (en) * | 2020-12-30 | 2021-04-20 | 成都信息工程大学 | Image completion method based on stack generation countermeasure network |
CN112991484A (en) * | 2021-04-28 | 2021-06-18 | 中国科学院计算技术研究所数字经济产业研究院 | Intelligent face editing method and device, storage medium and equipment |
CN113112572A (en) * | 2021-04-13 | 2021-07-13 | 复旦大学 | Hidden space search-based image editing method guided by hand-drawn sketch |
CN113129447A (en) * | 2021-04-12 | 2021-07-16 | 清华大学 | Three-dimensional model generation method and device based on single hand-drawn sketch and electronic equipment |
CN113222058A (en) * | 2021-05-28 | 2021-08-06 | 新疆爱华盈通信息技术有限公司 | Image classification method and device, electronic equipment and storage medium |
CN113298097A (en) * | 2021-07-27 | 2021-08-24 | 电子科技大学 | Feature point extraction method and device based on convolutional neural network and storage medium |
CN114266946A (en) * | 2021-12-31 | 2022-04-01 | 智慧眼科技股份有限公司 | Feature identification method and device under shielding condition, computer equipment and medium |
CN114359034A (en) * | 2021-12-24 | 2022-04-15 | 北京航空航天大学 | Method and system for generating face picture based on hand drawing |
WO2022126614A1 (en) * | 2020-12-18 | 2022-06-23 | 中国科学院深圳先进技术研究院 | Manifold optimization-based deep learning method for dynamic magnetic resonance imaging |
CN115358917A (en) * | 2022-07-14 | 2022-11-18 | 北京汉仪创新科技股份有限公司 | Method, device, medium and system for transferring non-aligned faces in hand-drawing style |
US12008821B2 (en) * | 2021-05-07 | 2024-06-11 | Google Llc | Machine-learned models for unsupervised image transformation and retrieval |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100134487A1 (en) * | 2008-12-02 | 2010-06-03 | Shang-Hong Lai | 3d face model construction method |
CN102163330A (en) * | 2011-04-02 | 2011-08-24 | 西安电子科技大学 | Multi-view face synthesis method based on tensor resolution and Delaunay triangulation |
CN107977629A (en) * | 2017-12-04 | 2018-05-01 | 电子科技大学 | A kind of facial image aging synthetic method of feature based separation confrontation network |
CN108520210A (en) * | 2018-03-26 | 2018-09-11 | 河南工程学院 | Based on wavelet transformation and the face identification method being locally linear embedding into |
CN108520503A (en) * | 2018-04-13 | 2018-09-11 | 湘潭大学 | A method of based on self-encoding encoder and generating confrontation network restoration face Incomplete image |
CN109886869A (en) * | 2018-10-15 | 2019-06-14 | 武汉工程大学 | A kind of unreal structure method of face of the non-linear expansion based on contextual information |
CN110175251A (en) * | 2019-05-25 | 2019-08-27 | 西安电子科技大学 | The zero sample Sketch Searching method based on semantic confrontation network |
Non-Patent Citations (5)
Title |
---|
SHAOJUN BIAN ET AL.: ""Fully Automatic Facial Deformation Transfer"", 《SYMMETRY 2020》 * |
YUWEI LI ET AL.: ""SweepCanvas Sketch-based 3D Prototyping on an RGB-D Image"", 《CONFERENCE:THE 30TH ANNUAL ACM SYMPOSIUM》 * |
王智飞: ""低分辨率人脸识别算法研究"", 《中国博士学位论文全文数据库信息科技辑》 * |
白万荣: ""人脸识别中基于流形学习的特征提取方法研究"", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
高晨昊: ""基于深度神经网络特征表达的人脸素描合成"", 《中国优秀硕士学位论文全文数据库信息科技辑》 * |
Also Published As
Publication number | Publication date |
---|---|
CN111915693B (en) | 2023-10-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||