CN112883826B - Face cartoon generation method based on learning geometry and texture style migration - Google Patents


Info

Publication number
CN112883826B
CN112883826B CN202110118105.7A
Authority
CN
China
Prior art keywords
cartoon
style
matrix
face
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110118105.7A
Other languages
Chinese (zh)
Other versions
CN112883826A (en)
Inventor
霍静
刘祥德
***
高阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN202110118105.7A priority Critical patent/CN112883826B/en
Publication of CN112883826A publication Critical patent/CN112883826A/en
Application granted granted Critical
Publication of CN112883826B publication Critical patent/CN112883826B/en


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168: Feature extraction; Face representation
    • G06V40/171: Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00: 2D [Two Dimensional] image generation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/40: Extraction of image or video features
    • G06V10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462: Salient features, e.g. scale invariant feature transforms [SIFT]

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a face cartoon generation method based on learning geometry and texture style migration. A geometric deformation module obtains a face deformation map; a texture migration module, following the manifold-alignment style-transfer assumption, migrates a new style through neural-network optimization. For locally similar semantic regions of the deformation map and the cartoon, the texture migration module constrains the stylized output to have feature maps similar to those of the cartoon style map. StyleGAN is used to generate an artistic cartoon data set that augments the available styles, and diverse cartoons are obtained by interpolation in the latent space, realizing diversified face cartoon generation. The invention can exaggerate a person's facial features and obtain, in a customized way, the geometric deformation styles of different artists; geometric distortion and texture are rendered in a targeted manner, and combined with interpolation-controllable cartoon style maps, so the generated facial cartoon images are more vivid and diverse.

Description

Face cartoon generation method based on learning geometry and texture style migration
Technical Field
The invention belongs to the field of computer application, and particularly relates to a face cartoon generation method based on learning geometry and texture style migration.
Background
The facial cartoon is an art form that expresses specific emotion and leaves an impressive impression by exaggerating a subject's features; it has rich and flexible diversity and is popular with the public. On the one hand, facial cartoons can take different depiction forms, such as simple line drawing, sketch, oil painting and the like; on the other hand, facial cartoons can express different emotions through different exaggeration styles. Meanwhile, cartoon creators also have their own artistic styles and modes of expression, further increasing the diversity of the facial cartoon art form.
Face cartoons are usually created by professionally skilled artists, so often only celebrities and the like have their own cartoon portraits. With the development and popularization of the internet and the mobile internet, more and more ordinary people want a cartoon image of themselves, yet commissioning a professional artist is inconvenient and costly. Therefore, automatically generating cartoon images from face photos by computer technology is attracting attention and favor. To produce a realistic caricature, two key issues need to be addressed. One is to perform facial geometry morphing to exaggerate certain key features of a person's face. The other is to synthesize a texture or style similar to a real caricature. There has been a great deal of prior work on caricature generation. Traditional automatic facial-cartoon generation methods fall into two major types: rule-based methods and sample-based methods. Rule-based methods adjust photos through manually preset rules to generate the facial cartoon, for example by computing the differences between the input face photo and an average face and exaggerating the most prominent differences; sample-based methods collect a cartoon sample library, detect the shape of the facial features and contours of each input photo, then search the library for the best-matching cartoon facial features and contours to compose a new cartoon image. These methods are limited in that the exaggeration style is constrained by the predefined rules and examples.
In recent years, with the wide application of deep learning in computer vision, cross-domain image translation methods based on deep learning, such as Cycle-GAN and MUNIT, can convert a face photo into a cartoon style, but the cartoons they generate lack exaggeration in shape. In addition, with the development of generative adversarial networks (GANs), some researchers have tried to generate caricatures using GANs. A major disadvantage of using a GAN to generate caricatures, however, is the large amount of data required for training; moreover, the resulting caricature is not customizable or modifiable by the user. For example, Warp-GAN exaggerates the shape of the face by warping on top of deep-learning-based style conversion, so the generated cartoon is more realistic in both color and shape. However, such methods can only generate one fixed shape-exaggeration style for a given input photo and cannot meet the demand for diverse cartoon styles. In general, automatic facial-cartoon generation faces the following difficulties: 1) generating a face cartoon from a face photo requires not only converting the color style of the image but also exaggerating its shape according to the input face's characteristics, artistic creation style, and so on; 2) the facial-cartoon art form is richly diverse: when cartoons are created from the same photo, different creation means, emotions, and artist styles produce many different color and shape styles; 3) furthermore, for the texture style migration process, the primary limitation is that generated caricature images may be restricted to the styles provided by the dataset.
Disclosure of Invention
The invention aims to: aiming at the task of automatically generating face cartoons, the invention provides a face cartoon generation method based on learning geometry and texture style migration, which uses a pure style-migration method to migrate the deformation and texture of cartoons, allows users to customize flexibly, and can generate cartoons with a variety of deformation styles for a single face photo.
The technical scheme is as follows: the invention discloses a face cartoon generation method based on learning geometry and texture style migration, which comprises the following steps:
(1) Obtaining key points of faces and cartoon: obtaining key points of a face photo and a cartoon photo through a face key point extraction algorithm;
(2) Building face and cartoon distribution: dividing a pre-acquired facial photo cartoon image data set into a training set and a testing set, and loading a representation matrix of facial photo key point distribution and cartoon key point distribution;
(3) Calculating a projection matrix: acquiring covariance matrixes of key point distribution of the face photo and cartoon key point distribution, acquiring linear transformation capable of enabling the key point distribution and the cartoon key point distribution to be aligned through a WCT algorithm, and marking the linear transformation as a projection matrix;
(4) Obtaining key points of a cartoon domain: projecting key points of the human face to the cartoon domain by utilizing the projection matrix, so as to obtain the key points corresponding to the human face in the cartoon domain;
(5) Acquiring a human face deformation graph: obtaining a Warp affine matrix from a human face to a cartoon according to the human face key points and the cartoon key points, and applying the affine matrix on the whole human face image so as to obtain a human face deformation graph;
(6) Calculating a feature-based neighbor matrix: respectively inputting the human face deformation graph and the cartoon style graph into a VGG-19 network to extract content representation and style representation; calculating cosine similarity of each position feature of the content representation and each position feature of the style representation to obtain a position-to-position neighbor matrix, wherein each element of the matrix encodes the similarity of the features in different spatial positions, and the matching relation of the content to the style semantic level is described;
(7) Semantic-based style loss: for the content representation of the facial deformation graph and the style representation of the cartoon, calculating a Euclidean distance matrix from position to position of the facial deformation graph and the style representation of the cartoon, wherein the Euclidean distance matrix and the neighbor matrix have equal large dimensions; defining K as the number of neighbors, assigning 1 to K positions corresponding to the Euclidean distance matrix according to the position relation of the first K neighbors in the neighbor matrix, and assigning 0 to the rest positions, and finally summing the distance matrix to be used as style loss;
(8) Iterative generation of cartoon pictures: based on the back-propagation algorithm, the gradient of the style loss is propagated back in an iterative update scheme, penalizing the distance between matched content and style features so that matched features are pulled ever closer; the input deformation map is gradually rendered with the style texture of the cartoon, yielding the final cartoon picture;
(9) Feature-based face content structure retention: re-inputting the generated cartoon to VGG-19 to extract stylized content representation, and using the mean square error of the stylized content representation and the deformation graph content representation as content loss, thereby keeping the original face structure of the deformation graph from being damaged by the texture migration process;
(10) Obtaining diversified cartoon styles: massive numbers of cartoons are generated by means of StyleGAN, while latent-space interpolation provides control over the generated cartoon, constructing a cartoon style distribution with rich characteristics;
(11) Deforming different input face photos with the projection matrix obtained in the training stage, and rendering textures with the semantic-level style migration method.
Further, the number of the key points in the step (1) is 128.
Further, the step (3) is implemented by the following formula:
$$L_{pw}=E_p D_p^{-1/2}E_p^{\top},\qquad P=L_{pw}\,E_c D_c^{1/2}E_c^{\top},\qquad \hat L_p=\bar L_p P$$

wherein $L_{pw}$ is the whitening matrix; the mean vector of the photo keypoints is $\mu_p$ and the mean vector of the cartoon keypoints is $\mu_c$; the centered picture keypoint matrix is $\bar L_p=L_p-\mathbf{1}\mu_p$ and the centered cartoon keypoint matrix is $\bar L_c=L_c-\mathbf{1}\mu_c$; eigenvalue decomposition of $\bar L_p^{\top}\bar L_p$ yields $D_p$, $E_p$, $E_p^{\top}$, where $D_p$ is the diagonal matrix whose diagonal elements are the photo-domain eigenvalues, $E_p$ is the orthogonal matrix whose column vectors are the photo-domain eigenvectors, and $E_p^{\top}$ is the transpose of $E_p$; eigenvalue decomposition of $\bar L_c^{\top}\bar L_c$ yields $D_c$, $E_c$, $E_c^{\top}$, where $D_c$ is the diagonal matrix of cartoon-domain eigenvalues, $E_c$ is the orthogonal matrix whose column vectors are the cartoon-domain eigenvectors, and $E_c^{\top}$ is the transpose of $E_c$; $\hat L_p$ is the aligned picture keypoint matrix, whose covariance matrix is the same as that of $\bar L_c$; $P$ is the projection matrix from the picture to the caricature.
Further, the keypoints corresponding to the face in the cartoon domain in the step (4) are:

$$l_{pc}=(l_p-\mu_p)\,P+\mu_c$$

wherein $P$ is the projection matrix, $l_p$ is the face keypoint vector of the test picture, $\mu_p$ is the mean vector of the face keypoints, and $\mu_c$ is the mean vector of the cartoon keypoints.
Further, the implementation process of the step (7) is as follows:

construct a graph matrix $A^l\in\{0,1\}^{(W_l H_l)\times(W_l H_l)}$, where $W_l$ and $H_l$ are the width and height of the feature map and $l$ refers to the features of the $l$-th layer; each element of the matrix encodes the similarity of features at different spatial positions and is defined as:

$$A^l(i,j)=\begin{cases}1, & C_i^l\in \mathrm{NN}_k(S_j^l)\ \text{and}\ S_j^l\in \mathrm{NN}_k(C_i^l)\\ 0, & \text{otherwise}\end{cases}$$

wherein $C_i^l$ is the feature vector of the content representation $C^l$ at the $i$-th position and $S_j^l$ is the feature vector of the style representation $S^l$ at the $j$-th position; $\mathrm{NN}_k(S_j^l)$ denotes the $k$ nearest neighbors of $S_j^l$ and $\mathrm{NN}_k(C_i^l)$ denotes the $k$ nearest neighbors of $C_i^l$;

the distance measure used to compute the $k$ nearest neighbors is the cosine distance; to achieve semantically aligned style migration, the following objective function is optimized:

$$\mathcal{L}_{style}=\sum_l\sum_{i,j}A^l(i,j)\,\bigl\|G_i^l-S_j^l\bigr\|_2^2$$

if $A^l(i,j)=1$, the feature at the $i$-th position of the content map shares the same semantics as the feature at the $j$-th position of the style map; the objective of stylization is therefore to force $G_i^l$ to be similar to $S_j^l$, and the final objective function is:

$$\mathcal{L}=\alpha\,\mathcal{L}_{content}+\beta\,\mathcal{L}_{style}$$

wherein $\mathcal{L}_{content}=\|G^{l_{con}}-C^{l_{con}}\|_2^2$ is the content loss defined on the layer-$l_{con}$ features, and $\alpha$ and $\beta$ are hyperparameters that balance content loss and style loss.
Further, in the step (10), massive StyleGAN-based cartoon generation is performed by training a generative adversarial network on a large-scale cartoon data set, so that cartoon style maps of different styles are generated.
The beneficial effects are that: compared with the prior art, 1. a WCT-based deformation style migration method is proposed for the first time, obtaining reasonably exaggerated and diversified deformation styles from the face keypoints and the cartoon keypoints; 2. the semantic-level style migration method based on neural-network optimization obtains cartoon textures with semantic consistency, and the cartoon can be produced iteratively and controllably, making the visual effect of the facial cartoon more vivid and interesting; 3. people's facial features can be exaggerated, and the geometric deformation styles of different artists can be obtained in a customized way; 4. a novel cartoon generation method with separable deformation migration and texture migration is provided, so the model can render geometric distortion and texture in a targeted manner; combined with interpolation-controllable cartoon style maps, the generated facial cartoon images are more vivid and diverse.
Drawings
FIG. 1 is a flow chart of the invention;
FIG. 2 is a schematic diagram of a geometry deforming network according to the present invention;
FIG. 3 is a drawing of an example of a facial cartoon generated using the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings.
The invention provides a face cartoon generation method based on learning geometry and texture style migration, which uses a geometric deformation module and a texture migration module to perform geometric deformation and texture rendering on the image respectively, generates an artistic cartoon data set with various styles through StyleGAN to augment the available styles, and allows various cartoons to be obtained by latent-space interpolation, thereby realizing diversified face cartoon generation. As shown in fig. 1, the method specifically comprises the following steps:
(1) Obtaining key points of faces and cartoons: obtaining 128 keypoints of the face photo and the cartoon photo through a face keypoint extraction algorithm.
For any photo in the data set, the invention adopts the face key point detection algorithm to extract the key points of the photo, so that the key point pairs from the face domain to the cartoon domain maintain the spatial consistency. Let us now assume that there are two sets of keypoints from the face domain and the caricature domain, respectively.
(2) Building face and cartoon distribution: dividing a training set and a testing set from a pre-acquired facial photo cartoon image data set, and loading a representation matrix of facial photo key point distribution and cartoon key point distribution.
The keypoint matrix of the face photos is denoted $L_p\in\mathbb{R}^{n_p\times d}$ and the cartoon keypoint matrix is denoted $L_c\in\mathbb{R}^{n_c\times d}$, where $n_p$ is the number of face photos, $n_c$ is the number of cartoons, and $d$ is the number of keypoint coordinates (keypoints $\times\,2$). When there are sufficiently many photos and cartoons, the two matrices delineate the distributions of the photo and cartoon keypoints.
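The keypoint matrices above can be assembled as follows. This is a minimal numpy sketch under our own naming conventions (the patent does not prescribe code); the random arrays merely stand in for detector output.

```python
import numpy as np

def build_keypoint_matrix(keypoint_sets):
    """Stack per-image keypoints (128 points, x/y each) into one matrix.

    keypoint_sets: iterable of (128, 2) arrays, one per image.
    Returns the (n, 256) matrix, its mean row vector, and the
    centered matrix later consumed by the WCT projection step.
    """
    L = np.stack([k.reshape(-1) for k in keypoint_sets])  # (n, 256)
    mu = L.mean(axis=0, keepdims=True)                    # (1, 256)
    return L, mu, L - mu

# toy example: random "keypoints" standing in for detector output
rng = np.random.default_rng(0)
faces = [rng.normal(size=(128, 2)) for _ in range(10)]
L_p, mu_p, Lp_centered = build_keypoint_matrix(faces)
```

The same helper would be applied to the cartoon keypoints to obtain $L_c$ and its centered form.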
(3) Calculating a projection matrix: and obtaining covariance matrixes of the key point distribution of the face photo and the key point distribution of the cartoon, obtaining linear transformation capable of enabling the key point distribution and the key point distribution to be aligned through a WCT algorithm, and marking the linear transformation as a projection matrix.
The geometric deformation network is responsible for learning the transformation from the photo face key points to the cartoon face key points, and deforming the image through the Warping operation. The detailed structure of the geometrically deformed network is shown in fig. 2:
given a face map, this module aims to perform a distorted photograph based on geometric deformation. The invention provides a WCT algorithm based on key points. Heretofore, the WCT algorithm was merely a texture for image style migration to obtain a style map. The invention uses the domain knowledge of key point distribution to apply the WCT algorithm in the geometric deformation migration domain for the first time by following the WCT theory idea.
By using WCT, the object of the invention is to find a projection such that the covariance matrix of the projected $L_p$ is the same as that of $L_c$. Thus, the projection maps the photo keypoints into the cartoon keypoint space. The detailed process is as follows:

the mean vector of the picture keypoints is $\mu_p$ and the mean vector of the cartoon keypoints is $\mu_c$; the centered picture keypoint matrix is $\bar L_p=L_p-\mathbf{1}\mu_p$ and the centered cartoon keypoint matrix is $\bar L_c=L_c-\mathbf{1}\mu_c$. The whitening operation first obtains the eigenvalues and eigenvectors of the picture covariance matrix $\bar L_p^{\top}\bar L_p$: eigenvalue decomposition yields $D_p$, $E_p$, $E_p^{\top}$, with $D_p$ the diagonal matrix whose diagonal elements are the eigenvalues, $E_p$ the orthogonal matrix whose column vectors are the eigenvectors, and $E_p^{\top}$ the transpose of $E_p$; evidently $\bar L_p^{\top}\bar L_p=E_p D_p E_p^{\top}$. The whitening of the picture keypoint matrix is then:

$$\hat L_{pw}=\bar L_p L_{pw}$$

wherein the whitening matrix $L_{pw}$ is

$$L_{pw}=E_p D_p^{-1/2}E_p^{\top}$$

and the whitened matrix satisfies $\hat L_{pw}^{\top}\hat L_{pw}=I$.

Similarly to the whitening process, eigenvalue decomposition is used on the cartoon keypoint covariance matrix: decomposition of $\bar L_c^{\top}\bar L_c$ yields $D_c$, $E_c$, $E_c^{\top}$, with $D_c$ the diagonal matrix whose diagonal elements are the eigenvalues, $E_c$ the orthogonal matrix whose column vectors are the eigenvectors, and $E_c^{\top}$ the transpose of $E_c$; evidently $\bar L_c^{\top}\bar L_c=E_c D_c E_c^{\top}$. The coloring process is:

$$\hat L_p=\hat L_{pw}\,E_c D_c^{1/2}E_c^{\top}$$

$\hat L_p$ is the aligned picture keypoint matrix, whose covariance matrix is the same as that of $\bar L_c$; a simple mathematical derivation shows that $\hat L_p^{\top}\hat L_p=\bar L_c^{\top}\bar L_c$, that is, by applying WCT to the keypoint matrices of the two domains, the algorithm aligns the two distributions. Finally, adding $\mu_c$ back to $\hat L_p$ gives the cartoon keypoints $L_{pc}$ corresponding to the transformed photos. Rearranging yields the projection matrix from picture to cartoon:

$$P=E_p D_p^{-1/2}E_p^{\top}\,E_c D_c^{1/2}E_c^{\top}$$
(4) Obtaining key points of a cartoon domain: and projecting the key points of the human face to the cartoon domain by using the projection matrix, thereby obtaining the key points corresponding to the human face in the cartoon domain.
In the testing stage, the invention uses the projection matrix $P$ to project the keypoints of the test picture into the cartoon keypoint space; the corresponding transformed cartoon keypoints are:

$$l_{pc}=(l_p-\mu_p)\,P+\mu_c$$

wherein $P$ is the projection matrix obtained in step (3), $l_p$ is the face keypoint vector of the test picture, $\mu_p$ is the mean vector of the face keypoints, and $\mu_c$ is the mean vector of the cartoon keypoints.
(5) Acquiring a human face deformation graph: and obtaining a Warp affine matrix from the human face to the cartoon according to the human face key points and the cartoon key points, and applying the affine matrix on the whole human face image so as to obtain the human face deformation graph.
From the cartoon-domain keypoints $l_{pc}$ obtained in step (4) and the face-domain keypoints $l_p$, an affine matrix $H$ from the face domain to the cartoon domain is computed, and the face photo is warped with this matrix to obtain the deformation map.
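Estimating the affine matrix from the paired keypoints is a least-squares fit. The sketch below uses plain numpy in place of an image-warping library (in practice the warp would be applied to pixels, e.g. with OpenCV; that part is omitted here), and all names are illustrative.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src points onto dst
    (the keypoint-fitting half of step (5)).

    src, dst: (n, 2) keypoint arrays. Returns a (3, 3) matrix H with
    [x', y', 1]^T = H [x, y, 1]^T.
    """
    n = len(src)
    A = np.hstack([src, np.ones((n, 1))])             # homogeneous src, (n, 3)
    coeffs, *_ = np.linalg.lstsq(A, dst, rcond=None)  # (3, 2) solution
    H = np.eye(3)
    H[:2, :] = coeffs.T
    return H

# sanity check: recover a known rotation + scale + translation
theta = 0.3
R = 1.5 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
t = np.array([10.0, -4.0])
src = np.random.default_rng(2).normal(size=(128, 2))
dst = src @ R.T + t
H = fit_affine(src, dst)
mapped = (np.hstack([src, np.ones((128, 1))]) @ H.T)[:, :2]
```

When the correspondences are exactly affine, `mapped` reproduces `dst` up to numerical precision; with real keypoint pairs the fit is the least-squares best approximation.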
(6) Calculating a feature-based neighbor matrix: respectively inputting the human face deformation graph and the cartoon style graph into a VGG-19 network to extract content representation and style representation; and calculating cosine similarity of each position feature of the content representation and each position feature of the style representation to obtain a position-to-position neighbor matrix, wherein each element of the matrix encodes the similarity of the features in different spatial positions, and the matching relation of the content to the style semantic level is described.
(7) Semantic-based style loss: for the content representation of the facial deformation graph and the style representation of the cartoon, calculating a Euclidean distance matrix from position to position of the facial deformation graph and the style representation of the cartoon, wherein the Euclidean distance matrix and the neighbor matrix have equal large dimensions; and defining K as the number of neighbors, assigning 1 to K positions corresponding to the Euclidean distance matrix according to the position relation of the first K neighbors in the neighbor matrix, and assigning 0 to the rest positions, and finally summing the distance matrix to obtain the style loss.
For the deformation map generated by the WCT, the next step is to render the caricature texture on it. Based on the manifold assumption, the invention provides an optimization-based semantically aligned style migration network.

Specifically, in the field of style migration, the picture providing the overall content structure is called the content map and the picture providing the stylized texture is called the style map; style migration aims to obtain stylized texture of the highest possible quality while preserving the original content structure. Here, the deformation map (content map) is denoted $I_p$, the cartoon (style map) is denoted $I_c$, and the generated cartoon is denoted $I_g$. Feeding the three images into the VGG-19 feature-extraction network yields output features $C^l$, $S^l$, $G^l\in\mathbb{R}^{D_l\times(W_l H_l)}$ respectively, where $D_l$ is the number of feature channels and $W_l$ and $H_l$ are the width and height of the layer-$l$ feature map.

$I_g$ is first initialized to $I_p$. The invention constructs a graph matrix $A^l\in\{0,1\}^{(W_l H_l)\times(W_l H_l)}$, each element of which encodes the similarity of features at different spatial positions:

$$A^l(i,j)=\begin{cases}1, & C_i^l\in \mathrm{NN}_k(S_j^l)\ \text{and}\ S_j^l\in \mathrm{NN}_k(C_i^l)\\ 0, & \text{otherwise}\end{cases}$$

wherein $C_i^l$ is the feature vector of $C^l$ at the $i$-th position and $S_j^l$ is the feature vector of $S^l$ at the $j$-th position; $\mathrm{NN}_k(\cdot)$ denotes the set of $k$ nearest neighbors, and the distance measure used to compute the $k$ nearest neighbors is the cosine distance. To achieve semantically aligned style migration, the algorithm needs to optimize the following objective function:

$$\mathcal{L}_{style}=\sum_l\sum_{i,j}A^l(i,j)\,\bigl\|G_i^l-S_j^l\bigr\|_2^2$$

If $A^l(i,j)=1$, the feature at the $i$-th position of the content map shares the same semantics as the feature at the $j$-th position of the style map; the objective of stylization is therefore to force $G_i^l$ to be similar to $S_j^l$. The final objective function is:

$$\mathcal{L}=\alpha\,\mathcal{L}_{content}+\beta\,\mathcal{L}_{style}$$

wherein $\mathcal{L}_{content}=\|G^{l_{con}}-C^{l_{con}}\|_2^2$ is the content loss defined on the layer-$l_{con}$ features, and $\alpha$ and $\beta$ are hyperparameters that balance content loss and style loss.
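The mutual-nearest-neighbor matrix and the masked style loss can be sketched on small feature arrays. This numpy illustration is ours (in the method these would be VGG-19 feature maps optimized with backpropagation; here random arrays stand in and no gradient step is taken).

```python
import numpy as np

def mutual_knn_matrix(C, S, k):
    """Binary graph matrix from step (6): A[i, j] = 1 when content
    feature i and style feature j are mutually among each other's
    k nearest neighbors under cosine distance."""
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
    Sn = S / np.linalg.norm(S, axis=1, keepdims=True)
    sim = Cn @ Sn.T                          # cosine similarity, (Nc, Ns)
    top_s = np.argsort(-sim, axis=1)[:, :k]  # k nearest style positions per content
    top_c = np.argsort(-sim.T, axis=1)[:, :k]
    A = np.zeros(sim.shape, dtype=bool)
    for i, js in enumerate(top_s):
        A[i, js] = True
    B = np.zeros_like(A)
    for j, idx in enumerate(top_c):
        B[idx, j] = True
    return (A & B).astype(float)             # mutual neighbors only

def style_loss(G, S, A):
    """Step (7): squared Euclidean distances between feature pairs,
    masked by the neighbor matrix and summed."""
    d2 = ((G[:, None, :] - S[None, :, :]) ** 2).sum(-1)  # (Nc, Ns)
    return (A * d2).sum()

rng = np.random.default_rng(3)
C = rng.normal(size=(16, 8))   # content features, one row per spatial position
S = rng.normal(size=(16, 8))   # style features
A = mutual_knn_matrix(C, S, k=3)
loss = style_loss(C.copy(), S, A)  # G is initialized to the content features
```

In the full method this loss, combined with the content loss, would be minimized over the stylized features by gradient descent (step (8)).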
(8) Iterative generation of cartoon pictures: based on the back-propagation algorithm, the gradient of the style loss is propagated back in an iterative update scheme, penalizing the distance between matched content and style features so that matched features are pulled ever closer; the input deformation map is gradually rendered with the style texture of the cartoon, yielding the final cartoon picture.
(9) Feature-based face content structure retention: and re-inputting the generated cartoon to the VGG-19, extracting stylized content representation, and using the mean square error of the stylized content representation and the deformation graph content representation as content loss, so that the original face structure of the deformation graph is kept from being damaged by the texture migration process.
(10) Obtaining diversified cartoon styles: massive numbers of cartoons are generated by means of StyleGAN, while latent-space interpolation provides control over the generated cartoon, constructing a cartoon style distribution with rich characteristics.
The disadvantage of most style migration methods is that style types are limited to those in the dataset. To obtain more style types, the invention can use StyleGAN to generate synthetic cartoons, which the user may then take as target style cartoons. Specifically, StyleGAN is trained on a large cartoon data set and then used to produce large numbers of synthetic cartoons of various styles. In addition, the user can apply latent-space interpolation between two synthetic cartoons to generate an intermediate style map and thus obtain the intended target cartoon.
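The interpolation itself is a simple linear blend of latent codes. A minimal sketch, assuming a 512-dimensional StyleGAN latent space (the pretrained generator that maps each code to a cartoon image is not included here):

```python
import numpy as np

def interpolate_latents(z1, z2, steps):
    """Linear interpolation in the latent space (step (10)).
    Each intermediate code, fed through a pretrained StyleGAN
    generator, yields an in-between cartoon style image."""
    ts = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - ts) * z1 + ts * z2   # (steps, dim)

# two latent codes standing in for sampled StyleGAN inputs
z1 = np.zeros(512)
z2 = np.ones(512)
codes = interpolate_latents(z1, z2, steps=5)
```

The endpoints reproduce the two source cartoons, and intermediate rows give a controllable gradation between their styles.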
(11) Deforming different input face photos with the projection matrix obtained in the training stage, and rendering textures with the semantic-level style migration method.
The face photos and cartoon images are preprocessed: the face is aligned and cropped according to the keypoints in the image, and the image is resized to 512×512 pixels; the keypoints of the face and cartoon images are obtained with a face keypoint detection algorithm; a facial-photo/cartoon-image data set is selected and divided into a training set and a test set. In the training stage, a suitable number of face photo keypoints and cartoon keypoints are loaded to obtain the centered face keypoint matrix and the centered cartoon keypoint matrix. In the geometric deformation module, the whitening algorithm is applied to the covariance matrix of the face keypoint matrix, so that the covariance matrix of the whitened face keypoints is diagonal; then, to align the whitened face keypoint covariance with the cartoon keypoint covariance, the coloring algorithm solves for the eigenvectors and the diagonal eigenvalue matrix of the cartoon keypoint covariance matrix, and rearranging yields the projection matrix from face to cartoon. In the test stage, for a given face photograph and its keypoints, the projection matrix is used to transform its keypoints into cartoon keypoints. From the paired face keypoints and cartoon keypoints an affine matrix is solved, and this matrix is finally used to warp the whole face map into the face deformation map. In the texture migration module, there are two ways to obtain the target cartoon style map: latent-space interpolation with StyleGAN, or random sampling from the real cartoon collection.
In the test stage, for a given cartoon style map, the neighbor relation matrix between the face deformation map (content map) and the cartoon (style map) is computed based on the manifold hypothesis, and the gradient of the semantically aligned style loss is back-propagated iteratively to optimize the input deformation map, rendering the artistic colors or textures of the target cartoon. During the iterative optimization, the content loss for content-structure preservation prevents damage to the inherent structural features of the face. After a suitable number of iterations, the final generated caricature is output, as shown in fig. 3.

Claims (5)

1. A face cartoon generation method based on learning geometry and texture style migration, characterized by comprising the following steps:
(1) Obtaining face and cartoon key points: obtaining the key points of a face photo and a cartoon image through a face key-point extraction algorithm;
(2) Building face and cartoon distributions: dividing a pre-acquired face-photo/cartoon-image dataset into a training set and a test set, and loading the matrices representing the face-photo key-point distribution and the cartoon key-point distribution;
(3) Calculating the projection matrix: acquiring the covariance matrices of the face-photo key-point distribution and the cartoon key-point distribution, obtaining through the WCT algorithm a linear transformation that aligns the two distributions, and recording this transformation as the projection matrix;
(4) Obtaining cartoon-domain key points: projecting the face key points into the cartoon domain with the projection matrix, thereby obtaining the key points corresponding to the face in the cartoon domain;
(5) Acquiring the face deformation map: solving the affine warp matrix from the face to the cartoon according to the face key points and the cartoon key points, and applying the affine matrix to the whole face image, thereby obtaining the face deformation map;
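The affine matrix of step (5) can be solved by least squares over the paired key points; a minimal sketch, assuming (n, 2) point arrays. Warping the actual image (e.g. with `cv2.warpAffine`) is omitted, and the point values are invented.

```python
import numpy as np

# Sketch of step (5): solve the least-squares 2x3 affine matrix mapping face
# key points to their projected cartoon key points.

def solve_affine(src, dst):
    """src, dst: (n, 2) paired key points. Returns the 2x3 affine matrix M
    with dst ≈ src @ M[:, :2].T + M[:, 2]."""
    A = np.hstack([src, np.ones((src.shape[0], 1))])   # homogeneous coordinates
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)        # (3, 2) least-squares solution
    return M.T                                          # (2, 3)

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
dst = src * 2.0 + np.array([10.0, 20.0])   # a known affine: scale by 2, then translate
M = solve_affine(src, dst)
print(np.round(M, 6))   # recovers [[2, 0, 10], [0, 2, 20]]
```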
(6) Calculating the feature-based neighbor matrix: inputting the face deformation map and the cartoon style map separately into a VGG-19 network to extract the content representation and the style representation; computing the cosine similarity between each position feature of the content representation and each position feature of the style representation to obtain a position-to-position neighbor matrix, wherein each element of the matrix encodes the similarity of features at different spatial positions and describes the semantic-level matching relation from content to style;
(7) Semantic-based style loss: for the content representation of the face deformation map and the style representation of the cartoon, computing a position-to-position Euclidean distance matrix with the same dimensions as the neighbor matrix; defining K as the number of neighbors, assigning 1 to the K positions of the Euclidean distance matrix indicated by the first K neighbors in the neighbor matrix and 0 to the remaining positions, and finally summing the masked distance matrix to obtain the style loss;
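Steps (6) and (7) can be sketched on tiny random feature arrays. For simplicity this sketch masks by the one-sided top-K style neighbors of each content position; the claim's neighbor matrix additionally requires the match to be mutual. All shapes and the K value are illustrative assumptions.

```python
import numpy as np

# Sketch of steps (6)-(7): cosine-similarity neighbors select which
# content/style feature pairs contribute to the style loss.

def semantic_style_loss(C, S, k=2):
    """C: (nc, d) content features; S: (ns, d) style features,
    one feature vector per spatial position."""
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
    Sn = S / np.linalg.norm(S, axis=1, keepdims=True)
    sim = Cn @ Sn.T                                  # cosine similarity, (nc, ns)
    mask = np.zeros_like(sim)                        # neighbor matrix
    nn = np.argsort(-sim, axis=1)[:, :k]             # top-k style neighbors per row
    np.put_along_axis(mask, nn, 1.0, axis=1)         # 1 at matched pairs, 0 elsewhere
    dist = ((C[:, None, :] - S[None, :, :]) ** 2).sum(-1)  # squared Euclidean distances
    return (mask * dist).sum()                       # sum only over matched pairs

rng = np.random.default_rng(1)
C = rng.normal(size=(5, 8))   # 5 content positions, 8-dim features
S = rng.normal(size=(7, 8))   # 7 style positions
loss = semantic_style_loss(C, S, k=2)
print(loss > 0)   # → True
```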
(8) Iteratively generating the cartoon picture: based on the back-propagation algorithm, returning the gradient of the style loss in an iterative-update manner and penalizing the distances between matched content and style features so that similar features are drawn ever closer; the input deformation map is thus gradually rendered with the style texture of the cartoon, yielding the final cartoon picture;
(9) Feature-based preservation of the face content structure: re-inputting the generated cartoon into VGG-19 to extract the stylized content representation, and using the mean squared error between the stylized content representation and the deformation-map content representation as the content loss, thereby keeping the original face structure of the deformation map from being damaged by the texture migration process;
(10) Obtaining diversified cartoon styles: generating a large number of cartoons by means of StyleGAN while providing control over the generated cartoons through latent-space interpolation, so as to construct a feature-rich cartoon style distribution;
(11) Deforming different input face photos with the projection matrix learned in the training stage, and rendering textures with the semantic-level style migration method;
the step (3) is realized by the following formula:
$$L_{pw} = E_p D_p^{-1/2} E_p^{T} \bar{L}_p, \qquad \bar{L}_{pc} = E_c D_c^{1/2} E_c^{T} L_{pw}, \qquad P = E_c D_c^{1/2} E_c^{T} E_p D_p^{-1/2} E_p^{T}$$

wherein $L_{pw}$ is the whitened picture key-point matrix; the mean vector of the picture key points is denoted $\bar{\mu}_p$ and the mean vector of the cartoon key points is denoted $\bar{\mu}_c$; the centered picture key-point matrix is $\bar{L}_p$ and the centered cartoon key-point matrix is $\bar{L}_c$; eigenvalue decomposition of $\bar{L}_p \bar{L}_p^{T}$ yields $D_p$, $E_p$ and $E_p^{T}$, where $D_p$ is the diagonal matrix whose diagonal elements are the eigenvalues in the picture domain, $E_p$ is the orthogonal matrix whose column vectors are the picture-domain eigenvectors, and $E_p^{T}$ is the transpose of $E_p$; eigenvalue decomposition of $\bar{L}_c \bar{L}_c^{T}$ yields $D_c$, $E_c$ and $E_c^{T}$, where $D_c$ is the diagonal matrix whose diagonal elements are the eigenvalues in the cartoon domain, $E_c$ is the orthogonal matrix whose column vectors are the cartoon-domain eigenvectors, and $E_c^{T}$ is the transpose of $E_c$; $\bar{L}_{pc}$ is the aligned picture key-point matrix, whose covariance matrix equals that of $\bar{L}_c$; $P$ is the projection matrix from pictures to cartoons.
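The whitening-coloring projection of step (3) can be sketched in numpy; the random key-point matrices below are stand-ins for the real centered distributions, and all shapes and sample counts are assumptions.

```python
import numpy as np

# Numpy sketch of the WCT projection in step (3).

def wct_projection(Lp, Lc):
    """Lp: (2, n) centered photo key points; Lc: (2, m) centered cartoon key points.
    Returns P such that the covariance of P @ Lp equals the covariance of Lc."""
    cov_p = Lp @ Lp.T / Lp.shape[1]
    cov_c = Lc @ Lc.T / Lc.shape[1]
    # Whitening: remove the photo covariance (E_p D_p^{-1/2} E_p^T)
    dp, Ep = np.linalg.eigh(cov_p)
    whiten = Ep @ np.diag(dp ** -0.5) @ Ep.T
    # Coloring: impose the cartoon covariance (E_c D_c^{1/2} E_c^T)
    dc, Ec = np.linalg.eigh(cov_c)
    color = Ec @ np.diag(dc ** 0.5) @ Ec.T
    return color @ whiten

rng = np.random.default_rng(0)
Lp = rng.normal(size=(2, 200)); Lp -= Lp.mean(axis=1, keepdims=True)
Lc = rng.normal(size=(2, 300)) * np.array([[3.0], [0.5]]); Lc -= Lc.mean(axis=1, keepdims=True)

P = wct_projection(Lp, Lc)
aligned = P @ Lp
# The projected photo key points now share the cartoon covariance
print(np.allclose(aligned @ aligned.T / aligned.shape[1], Lc @ Lc.T / Lc.shape[1]))  # → True
```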
2. The method for generating facial cartoon based on learning geometry and texture style migration of claim 1 wherein the number of key points in step (1) is 128.
3. The method for generating a facial cartoon based on learning geometry and texture style migration according to claim 1, wherein the key points corresponding to the facial cartoon in the step (4) are:
$$l_c = P\,(l_p - \bar{\mu}_p) + \bar{\mu}_c$$

wherein $P$ is the projection matrix, $l_p$ is the face key-point matrix of the test picture, the mean vector of the face key points is denoted $\bar{\mu}_p$, and the mean vector of the cartoon key points is denoted $\bar{\mu}_c$.
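Applying the projection of claim 3 to a test face's key points looks like this in numpy; every numeric value (the matrix $P$, the means, the key points) is invented purely for illustration.

```python
import numpy as np

# Center the test key points, project with P, then shift to the cartoon mean.
P = np.array([[1.5, 0.0],
              [0.0, 0.8]])           # assumed 2x2 projection matrix
mu_p = np.array([[256.0], [256.0]])  # assumed mean of photo key points
mu_c = np.array([[240.0], [270.0]])  # assumed mean of cartoon key points

l_p = np.array([[300.0, 200.0],      # x-coordinates of two test key points
                [310.0, 220.0]])     # y-coordinates

l_c = P @ (l_p - mu_p) + mu_c        # center, project, shift to the cartoon mean
print(l_c)   # → [[306.  156.] [313.2 241.2]]
```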
4. The face cartoon generating method based on learning geometry and texture style migration of claim 1, wherein the implementation process of the step (7) is as follows:
constructing a graph matrix $A^{l} \in \{0,1\}^{(W_l H_l) \times (W_l H_l)}$, where $W_l$ and $H_l$ are respectively the width and height of the feature map and $l$ refers to the features of the $l$-th layer; each element of the matrix encodes the similarity of features at different spatial positions and is defined as follows:

$$A^{l}(i,j) = \begin{cases} 1, & \text{if } S_j^{l} \in \mathrm{NN}_k\!\left(C_i^{l}\right) \text{ and } C_i^{l} \in \mathrm{NN}_k\!\left(S_j^{l}\right) \\ 0, & \text{otherwise} \end{cases}$$

wherein $C_i^{l}$ is the feature vector of $C^{l}$ at the $i$-th position and $S_j^{l}$ is the feature vector of $S^{l}$ at the $j$-th position; $\mathrm{NN}_k(C_i^{l})$ denotes the $k$ nearest neighbors of $C_i^{l}$ and $\mathrm{NN}_k(S_j^{l})$ denotes the $k$ nearest neighbors of $S_j^{l}$; the distance measure used to compute the $k$ neighbors is the cosine distance; to achieve semantically aligned style migration, the following objective function is optimized:

$$\mathcal{L}_{style} = \sum_{l} \sum_{i,j} A^{l}(i,j)\, \left\| G_i^{l} - S_j^{l} \right\|_2^2$$

if $A^{l}(i,j)=1$, the feature at the $i$-th position of the content map shares the same semantics as the feature at the $j$-th position of the style map; the stylization objective forces $G^{l}$ to be similar to $S^{l}$ at the matched positions; the final objective function is as follows:

$$\mathcal{L} = \alpha\, \mathcal{L}_{content} + \beta\, \mathcal{L}_{style}$$

wherein $\mathcal{L}_{content}$ is the content loss defined on the features of layer $l_{con}$, and $\alpha$ and $\beta$ are hyper-parameters that balance the content loss and the style loss.
5. The method according to claim 1, wherein the step (10) of generating a large number of cartoons by means of StyleGAN consists in generating cartoon style maps of different styles with a generative adversarial network trained on a large-scale cartoon dataset.
CN202110118105.7A 2021-01-28 2021-01-28 Face cartoon generation method based on learning geometry and texture style migration Active CN112883826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110118105.7A CN112883826B (en) 2021-01-28 2021-01-28 Face cartoon generation method based on learning geometry and texture style migration


Publications (2)

Publication Number Publication Date
CN112883826A CN112883826A (en) 2021-06-01
CN112883826B true CN112883826B (en) 2024-04-09

Family

ID=76052999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110118105.7A Active CN112883826B (en) 2021-01-28 2021-01-28 Face cartoon generation method based on learning geometry and texture style migration

Country Status (1)

Country Link
CN (1) CN112883826B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113989441B (en) * 2021-11-16 2024-05-24 北京航空航天大学 Automatic three-dimensional cartoon model generation method and system based on single face image
CN114897672A (en) * 2022-05-31 2022-08-12 北京外国语大学 Image cartoon style migration method based on equal deformation constraint
CN115358917B (en) * 2022-07-14 2024-05-07 北京汉仪创新科技股份有限公司 Method, equipment, medium and system for migrating non-aligned faces of hand-painted styles
CN116310008B (en) * 2023-05-11 2023-09-19 深圳大学 Image processing method based on less sample learning and related equipment

Citations (5)

Publication number Priority date Publication date Assignee Title
CN110415308A (en) * 2019-06-21 2019-11-05 浙江大学 A kind of human-face cartoon generation method based on cyclic space switching network
CN111508048A (en) * 2020-05-22 2020-08-07 南京大学 Automatic generation method for human face cartoon with interactive arbitrary deformation style
CN111508069A (en) * 2020-05-22 2020-08-07 南京大学 Three-dimensional face reconstruction method based on single hand-drawn sketch
CN112232485A (en) * 2020-10-15 2021-01-15 中科人工智能创新技术研究院(青岛)有限公司 Cartoon style image conversion model training method, image generation method and device
CN112258387A (en) * 2020-10-30 2021-01-22 北京航空航天大学 Image conversion system and method for generating cartoon portrait based on face photo

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US10529115B2 (en) * 2017-03-20 2020-01-07 Google Llc Generating cartoon images from photos
US10607065B2 (en) * 2018-05-03 2020-03-31 Adobe Inc. Generation of parameterized avatars

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN110415308A (en) * 2019-06-21 2019-11-05 浙江大学 A kind of human-face cartoon generation method based on cyclic space switching network
CN111508048A (en) * 2020-05-22 2020-08-07 南京大学 Automatic generation method for human face cartoon with interactive arbitrary deformation style
CN111508069A (en) * 2020-05-22 2020-08-07 南京大学 Three-dimensional face reconstruction method based on single hand-drawn sketch
CN112232485A (en) * 2020-10-15 2021-01-15 中科人工智能创新技术研究院(青岛)有限公司 Cartoon style image conversion model training method, image generation method and device
CN112258387A (en) * 2020-10-30 2021-01-22 北京航空航天大学 Image conversion system and method for generating cartoon portrait based on face photo

Non-Patent Citations (4)

Title
Yu Qian; Gao Yang; Huo Jing; Zhuang Yunkai. Discriminative joint multi-manifold analysis in video face recognition. Journal of Software. 2015, full text. *
Research and application of automatic face caricature generation algorithms; Wang Deqiu; China Excellent Masters' Theses Database; full text *
Example-based portrait line-drawing generation ***; Chen Hong, Zheng Nanning, Xu Yingqing, Shen Xiangyang; Journal of Software; 2003-02-23 (No. 02); full text *
Cartoon-style face portrait generation algorithm; Yan Fang; Fei Guangzheng; Liu Tingting; Ma Wenhui; Shi Minyong; Journal of Computer-Aided Design & Computer Graphics (No. 04); full text *


Similar Documents

Publication Publication Date Title
CN112883826B (en) Face cartoon generation method based on learning geometry and texture style migration
CN111632374B (en) Method and device for processing face of virtual character in game and readable storage medium
Güçlütürk et al. Convolutional sketch inversion
CN106971414B (en) Three-dimensional animation generation method based on deep cycle neural network algorithm
Zhang et al. Facial expression retargeting from human to avatar made easy
Zhong et al. Towards practical sketch-based 3d shape generation: The role of professional sketches
Shim et al. A subspace model-based approach to face relighting under unknown lighting and poses
KR20200052438A (en) Deep learning-based webtoons auto-painting programs and applications
CN104732506A (en) Character picture color style converting method based on face semantic analysis
CN105354248A (en) Gray based distributed image bottom-layer feature identification method and system
Yoo et al. Local color transfer between images using dominant colors
KR20200064591A (en) Webtoons color customizing programs and applications of deep learning
CN107392213B (en) Face portrait synthesis method based on depth map model feature learning
CN110097615B (en) Stylized and de-stylized artistic word editing method and system
CN109325994B (en) Method for enhancing data based on three-dimensional face
CN102013020B (en) Method and system for synthesizing human face image
CN112837210A (en) Multi-form-style face cartoon automatic generation method based on feature image blocks
CN110428404B (en) Artificial intelligence-based auxiliary culture and auxiliary appreciation formulation system
CN110288667A (en) A kind of image texture moving method based on structure guidance
CN114037644B (en) Artistic word image synthesis system and method based on generation countermeasure network
Liu et al. Palette-based recoloring of natural images under different illumination
Yu et al. Deep semantic space guided multi-scale neural style transfer
CN112489218B (en) Single-view three-dimensional reconstruction system and method based on semi-supervised learning
CN114742890A (en) 6D attitude estimation data set migration method based on image content and style decoupling
Way et al. TwinGAN: Twin generative adversarial network for Chinese landscape painting style transfer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant