WO2023005385A1 - Method for training smile image generation model and method for generating smile image - Google Patents


Info

Publication number
WO2023005385A1
Authority
WO
WIPO (PCT)
Prior art keywords
smile
image
human face
generation model
face sample
Prior art date
Application number
PCT/CN2022/094789
Other languages
French (fr)
Chinese (zh)
Inventor
白须
Original Assignee
北京字跳网络技术有限公司 (Beijing Zitiao Network Technology Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 北京字跳网络技术有限公司 (Beijing Zitiao Network Technology Co., Ltd.)
Publication of WO2023005385A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • G06N20/20 - Ensemble learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0464 - Convolutional networks [CNN, ConvNet]
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 - Geometric image transformations in the plane of the image
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 - Arrangements for image or video recognition or understanding
    • G06V10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • Embodiments of the present disclosure relate to the field of computers, and in particular to a method for training a smile image generation model, a method for generating a smile image, a device, an electronic device, a storage medium, a computer program product, and a computer program.
  • Facial images can be processed to some extent through related image processing technologies.
  • For example, a face image under the attribute type "smile" can be obtained.
  • embodiments of the present disclosure provide a method for training a smile image generation model, a method for generating a smile image, a device, an electronic device, a storage medium, a computer program product, and a computer program.
  • the present disclosure provides a method for training a smile image generation model, including:
  • using the smile gradient function, performing second processing on a plurality of second face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile degrees;
  • an embodiment of the present disclosure provides a method for generating a smile image, including:
  • inputting the target face image and an expected smile degree into the trained smile image generation model, so that the trained smile image generation model outputs a smile image corresponding to the target face image at the expected smile degree;
  • the smile image generation model is trained by the training method according to any one of claims 1-8.
  • the present disclosure provides a training device for a smile image generation model, including:
  • an image acquisition module, configured to acquire a first face sample image;
  • a first processing module, configured to perform first processing on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image;
  • a second processing module, configured to use the smile gradient function to perform second processing on a plurality of second face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile degrees;
  • a third processing module, configured to use the sample smile image group to train a preset smile image generation model to obtain a trained smile image generation model.
  • an apparatus for generating a smile image, including:
  • an image acquisition module, configured to acquire a target face image;
  • an image generation module, configured to input the target face image and an expected smile degree into a trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the expected smile degree;
  • where the smile image generation model is trained by the training method described in any one of the first aspect above.
  • an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory;
  • the memory stores computer-executable instructions;
  • the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the method for training a smile image generation model according to the first aspect and its various possible designs, and/or the method for generating a smile image according to the second aspect and its various possible designs.
  • an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method for training a smile image generation model according to the first aspect and its various possible designs, and/or the method for generating a smile image according to the second aspect and its various possible designs.
  • an embodiment of the present disclosure provides a computer program product including computer instructions which, when executed by a processor, implement the method for training a smile image generation model according to the first aspect and its various possible designs, and/or the method for generating a smile image according to the second aspect and its various possible designs.
  • an embodiment of the present disclosure provides a computer program which, when executed by a processor, implements the method for training a smile image generation model according to the first aspect and its various possible designs, and/or the method for generating a smile image according to the second aspect and its various possible designs.
  • In the method for training a smile image generation model and the method for generating a smile image provided by the embodiments of the present disclosure, a first face sample image is acquired; first processing is performed on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image; using the smile gradient function, second processing is performed on a plurality of second face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile degrees; and the sample smile image group is used to train a preset smile image generation model to obtain a trained smile image generation model.
  • With the trained smile image generation model, a target face image can be processed to generate a smile image consistent with an expected smile degree. In this way, a smile image of a face at a desired smile degree can be obtained, providing users with a richer experience.
  • FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based;
  • FIG. 2 is a schematic flowchart of a training method for a smile image generation model provided by an embodiment of the present disclosure
  • FIG. 3 is a schematic diagram of a data flow of a training method for a smile image generation model provided by the present disclosure
  • FIG. 4 is a schematic data flow diagram of a training method for a smile image generation model provided by an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of a method for generating a smile image provided by an embodiment of the present disclosure
  • FIG. 6 is a structural block diagram of a training device for a smile image generation model provided by an embodiment of the present disclosure
  • FIG. 7 is a structural block diagram of a device for generating a smile image provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of a hardware structure of an electronic device provided by an embodiment of the present disclosure.
  • Before use, the model needs to be trained. Specifically, a large number of face sample images labeled with a smiling state or a non-smiling state can be used to train the above-mentioned generative adversarial network model based on the StyleGAN architecture, so that the model can separate and control the image features of the "smile" attribute type in an image. Then, a face image in a non-smiling state is input into the trained model, and the model outputs the corresponding face image in a smiling state.
  • However, existing face image processing methods can only output a single face image: they can only process a face image in a non-smiling state to obtain a face image in a laughing state, and cannot obtain face images with more gradations of smiling. As a result, the content produced by the prior art is relatively limited, the image quality is poor, and the scope of application is narrow, which is not conducive to the long-term development of the processing technology.
  • Further, the inventors found that the reason the GAN model cannot generate gradient smile images is that there are not enough training samples of smile-gradient images; that is, the face image samples used to train existing models generally include only the non-smiling state and the laughing state, and acquiring face image samples with smile gradients would require a great deal of time and labor.
  • In view of this, a smile gradient function used to represent the smile degree in the first face sample image can be constructed first; the smile gradient function is then used to process the second face sample images to obtain a sample smile image group containing smile images of each second face sample image at different smile degrees, and these sample smile images are then used to train the smile image generation model.
  • the target face image can be processed to generate a smile image consistent with the expected smile degree.
  • FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based.
  • the network architecture shown in FIG. 1 may specifically include at least one terminal 1 and a server 2 .
  • the terminal 1 may specifically be a hardware device such as a user's mobile phone, a smart home device, a tablet computer, or a wearable electronic device.
  • the server 2 may specifically be a server or a server cluster set in the cloud.
  • the first face sample image can be used to construct a smile gradient function in advance
  • the smile gradient function can be used to process the second face sample image to obtain a sample smile image group
  • the training of the smile image generation model can then be accomplished through this sample smile image group.
  • the trained smile image generation model processes the target face image to generate a smile image at the desired smile degree, which is then presented to the user.
  • The architecture shown in Figure 1 can be applied to various app scenarios involving image processing, such as image special-effects apps and apps with filter shooting functions.
  • the method for generating a smile image provided in the present disclosure can be applied to scenarios based on special effects of human face images and the like.
  • Face image special effects refer to the face-based special effects widely used in some video applications.
  • The method for generating a smile image provided by the present disclosure can generate, for a target face image, a series of smile images with different smile degrees, presenting a gradient smile and providing users with more combinations of special effects and video gameplay.
  • FIG. 2 is a schematic flowchart of a method for training a smile image generation model provided by an embodiment of the present disclosure.
  • the training method of the smile image generation model provided by the embodiment of the present disclosure includes:
  • Step 201: Acquire a first face sample image.
  • Step 202: Perform first processing on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image.
  • Step 203: Using the smile gradient function, perform second processing on a plurality of second face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile degrees.
  • Step 204: Use the sample smile image group to train a preset smile image generation model to obtain a trained smile image generation model.
  • In order to enable the smile image generation model to generate smile images of a target face image at different smile degrees, enough sample images at different smile degrees are needed first to train the model; for details, refer to steps 201 to 203. Then, referring to step 204, the smile image generation model can be trained on the obtained sample smile image group, yielding a trained model that processes a target face image and outputs a smile image of that face at a desired smile degree.
  • The training device for the smile image generation model first acquires a large number of first face sample images. To facilitate construction of the smile gradient function, among these first face sample images some contain smiling faces while others contain non-smiling faces.
  • First processing, including feature extraction and image analysis, may then be performed on the first face sample images to construct the smile gradient function.
  • the smile gradient function is used to represent the degree of smile in the face sample image.
  • The variable of the smile gradient function may specifically be the smile amplitude.
  • The degree of a smile can be determined from a number of facial features: for example, the smile degree is higher when more teeth are exposed, when the eyes are squinted, or when there are laugh lines around the corners of the mouth.
  • Accordingly, the smile degree corresponding to each first face sample image can be determined from the factors above, and the corresponding smile gradient function can then be obtained.
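The cues listed above can be combined into a single score. A minimal illustrative sketch only (the cue names, weights, and the 0-10 scale are assumptions for illustration, not the disclosure's actual method):

```python
# Illustrative heuristic: combine facial cues into one smile degree score.
# Cue names and weights are assumed; each cue is normalized to [0, 1].
def smile_degree(teeth_exposure: float, eye_squint: float, laugh_lines: float) -> float:
    """Higher cue values indicate a stronger smile; result is on a 0-10 scale."""
    weights = {"teeth": 0.5, "eyes": 0.3, "lines": 0.2}   # assumed weights
    score = (weights["teeth"] * teeth_exposure
             + weights["eyes"] * eye_squint
             + weights["lines"] * laugh_lines)
    return round(10.0 * score, 2)
```

In practice such cues would come from a face landmark or classification model rather than hand-set values.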
  • The training device can then use the smile gradient function to perform second processing, including image transformation, on the plurality of second face sample images, so that each processed second face sample image presents a different smile degree according to the variable in the smile gradient function; this yields a sample smile image group containing smile images of each second face sample image at different smile degrees.
  • To ensure the training effect of the model, the sample smile image group in the present disclosure should include smile images at as many different smile degrees as possible, so as to enrich the model's training samples.
  • Specifically, the variable in the smile gradient function can be assigned multiple values, and each second face sample image is processed under each value to obtain smile images of that sample image at different smile degrees.
  • Finally, the preset smile image generation model can be trained using the sample smile image group to obtain the trained smile image generation model.
  • this implementation also provides a specific method for constructing a smile gradient function:
  • Since the smile gradient function is used to represent the degree of smile in a face image, it can be understood as the product of a step size (magnitude) and the direction in which the face image moves along the "smile" direction (the normal vector direction).
  • To construct the smile gradient function, it is therefore first necessary to determine the "smile" direction, that is, the direction of the normal vector.
  • the determination of the direction of the normal vector of the smile gradient function may be specifically implemented based on a linear SVM classifier.
  • A linear SVM classifier is a simple and effective classifier that can be used to split a set of features so that similar features are assigned to the same class.
  • Specifically, the linear SVM classifier determines a hyperplane such that features on one side of the hyperplane belong to one class and features on the other side belong to the other class. By exploiting this property, the hyperplane can be determined, the normal vector of the hyperplane can be obtained from it, and the direction of that normal vector can be used as the normal vector direction of the smile gradient function, from which the corresponding smile gradient function is obtained.
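As a hedged sketch of this step (toy data stands in for real latent codes; in the disclosure the inputs would be latent variable features from the pre-training model), a linear SVM can be fit to two classes of features and its weight vector taken as the normal direction N of the separating hyperplane:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Toy stand-ins for latent variable features: 8-D vectors for 200
# "not smiling" (label 0) and 200 "smiling" (label 1) sample images.
dim = 8
neg = rng.normal(loc=-1.0, scale=0.5, size=(200, dim))   # not smiling
pos = rng.normal(loc=+1.0, scale=0.5, size=(200, dim))   # smiling
latents = np.vstack([neg, pos])
labels = np.array([0] * 200 + [1] * 200)

# Fit a linear SVM; its weight vector is normal to the separating hyperplane.
clf = LinearSVC(C=1.0).fit(latents, labels)
w = clf.coef_[0]
normal_direction = w / np.linalg.norm(w)   # unit normal vector N

# A smile gradient function C*N is then the step size times this direction.
def smile_gradient(step_c: float) -> np.ndarray:
    return step_c * normal_direction
```

Because the positive class is "smiling", N points from the non-smiling side of the hyperplane toward the smiling side.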
  • Specifically, the latent variable features can be classified according to the smile type of each first face sample image, and the linear classifier model can then be trained based on the classification result.
  • hidden variables generally refer to unobservable random variables.
  • the smile type can be understood as an observable variable
  • the latent variable feature can be understood as an unobservable variable.
  • In the embodiments of the present disclosure, each first face sample image corresponds to a smile degree value and a latent variable feature.
  • The smile degree value of a first face sample image can be obtained by using a trained smile classification model to classify the face in that first face sample image.
  • The smile classification model is a neural network model that can be used to classify and evaluate the degree of smile; specifically, it can be a classification model based on the resnet-101 network architecture, used to classify and evaluate an input image along the smile dimension.
  • The smile classification model can be trained in advance. That is, in an optional implementation, smile classification sample images and corresponding smile type annotations are obtained, and the smile classification sample images and smile type annotations are used to train a pre-built smile classification model to obtain the trained smile classification model.
  • The smile type annotations are obtained by manually judging each smile classification sample image: when the face in the sample image is laughing heartily, the smile type of the image is annotated as "laughing"; when the face has a smile, the smile type is annotated as "smiling"; and when the face has no smile, the smile type is annotated as "not smiling".
  • Training with the smile classification sample images and the corresponding smile type annotations enables the smile classification model to learn "laughing", "smiling", and "not smiling" images, and these types can in turn correspond to different smile degree values, with "laughing" having the highest smile degree value and "not smiling" the lowest.
  • The trained smile classification model can then be used to identify the smile type of each first face sample image, so as to obtain the smile degree value of each first face sample image.
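The mapping from annotated smile type to smile degree value can be sketched as a simple lookup. The 0-10 scale follows the characterization used later in this disclosure (0 for not smiling, 10 for a wide laugh); the intermediate value for "smiling" is an illustrative assumption:

```python
# Hypothetical mapping from manually annotated smile type to a smile degree
# value on the 0-10 scale; the value for "smiling" is assumed for illustration.
SMILE_DEGREE = {"not smiling": 0, "smiling": 5, "laughing": 10}

def smile_degree_value(smile_type: str) -> int:
    """Look up the smile degree value for an annotated smile type."""
    return SMILE_DEGREE[smile_type]
```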
  • The latent variable features of a first face sample image are determined when that first face sample image is generated by the pre-training model.
  • FIG. 3 is a schematic diagram of a data flow of a training method for a smile image generation model provided by the present disclosure.
  • In this embodiment, a pre-training model is preset first; the pre-training model may include a StyleGAN model, which can generate a large number of first face sample images.
  • When each first face sample image is generated, the latent parameters used by the pre-training model are recorded and form the latent variable features of that first face sample image.
  • Step 2021: According to the smile degree value of each first face sample image, classify the latent variable features corresponding to each first face sample image to obtain first-class latent variable feature samples and second-class latent variable feature samples.
  • Step 2022: Train a linear classifier model on the first-class latent variable feature samples and the second-class latent variable feature samples.
  • Step 2023: Obtain the hyperplane output by the trained linear classifier model, and obtain the normal vector direction of the hyperplane.
  • Step 2024: Obtain the smile gradient function according to the normal vector direction.
  • As described above, for the hyperplane determined by the linear SVM classifier, features on one side of the hyperplane belong to one class, and features on the other side belong to the other class.
  • Therefore, the latent variable feature samples can be divided into two classes according to smile degree, so that one class of latent variable feature samples serves as the features on one side of the hyperplane and the other class serves as the features on the other side.
  • In other words, the latent variable features are classified to obtain the first-class latent variable feature samples (positive latent variable feature samples) and the second-class latent variable feature samples (negative latent variable feature samples) used for training the linear SVM classifier, so that the trained linear SVM classifier accurately separates the first-class samples from the second-class samples.
  • Specifically, the training device sorts the latent variable features of the first face sample images according to their smile degree values to obtain a latent variable feature sequence; then, according to the latent variable feature sequence, it classifies the first face sample images to obtain the first-class latent variable feature samples and the second-class latent variable feature samples.
  • Specifically, the smile degree value of each first face sample image can be represented by a value between 0 and 10, where 0 represents not smiling and 10 represents a wide laugh.
  • The first face sample images can be sorted according to the values characterizing their smile degrees, yielding an ordering of triples [first face sample image, latent variable feature, smile degree value] (for example, sorted by smile degree value), for example:
  • Based on this sequence, each latent variable feature is classified to obtain the first-class latent variable feature samples (positive latent variable feature samples) and the second-class latent variable feature samples (negative latent variable feature samples).
  • For example, the first-class latent variable feature samples in the above example include: [first face sample image 4, latent variable feature 4, smile degree 10] and [first face sample image 2, latent variable feature 2, smile degree 7];
  • the second-class latent variable feature samples include: [first face sample image 1, latent variable feature 1, smile degree 1] and [first face sample image 3, latent variable feature 3, smile degree 0].
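The sort-and-split above can be sketched in a few lines. All names and the split threshold of 5 on the 0-10 scale are illustrative assumptions:

```python
# Illustrative sketch: sort sample records by smile degree value, then split
# them into positive ("smiling") and negative ("not smiling") latent-feature
# samples for training the linear SVM classifier.
samples = [
    ("image_1", "latent_1", 1),
    ("image_2", "latent_2", 7),
    ("image_3", "latent_3", 0),
    ("image_4", "latent_4", 10),
]

# Sort by smile degree value (descending), mirroring the example above.
ordered = sorted(samples, key=lambda rec: rec[2], reverse=True)

# Split around an assumed threshold of 5 on the 0-10 smile degree scale.
positive = [rec for rec in ordered if rec[2] >= 5]   # first-class samples
negative = [rec for rec in ordered if rec[2] < 5]    # second-class samples
```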
  • Then, the training device trains the linear SVM classifier (that is, the linear classifier model) on the obtained first-class and second-class latent variable feature samples, obtains the normal vector of the hyperplane of the trained linear SVM classifier to determine the normal vector direction N, and forms the smile gradient function as the product C*N of the normal vector direction N and a step size C along that direction.
  • That is, the linear SVM classifier is used to determine the hyperplane, and the hyperplane in turn determines the aforementioned smile gradient function.
  • Specifically, the smile gradient function is associated with the hyperplane that separates features of the "smiling" type from features of the "not smiling" type. Based on the hyperplane, the normal vector direction N can be determined, a movement of a certain magnitude (the step size C) is taken along N, and the smile gradient function C*N is finally obtained.
  • the gradient function C*N can be used to represent the magnitude of the image on the feature dimension of "smile", that is, the degree of smile in the face image.
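The edit implied by C*N can be sketched as moving a latent code along the normal direction; all shapes and values below are illustrative assumptions (a real N would come from the trained SVM, and the edited latent would be decoded back into an image):

```python
import numpy as np

# Minimal sketch: moving a latent code along the unit normal direction N of
# the separating hyperplane by step C changes the "smile" magnitude of the
# corresponding generated face image.
dim = 8
n = np.zeros(dim)
n[0] = 1.0                      # assumed unit normal vector N
w = np.full(dim, 0.5)           # latent code of one face sample image

def apply_smile_gradient(latent: np.ndarray, step_c: float) -> np.ndarray:
    """Return the latent code moved by the smile gradient C*N."""
    return latent + step_c * n

edited = apply_smile_gradient(w, 2.0)
```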
  • After the smile gradient function is obtained, it can be used to perform second processing on a plurality of second face sample images to obtain the sample smile image group.
  • The second face sample images may include face images obtained by adding noise data to the first face sample images, or randomly generated face images.
  • Specifically, each second face sample image can be moved along the normal vector direction multiple times with different step sizes to obtain smile images of that second face sample image at different smile degrees; the smile images of all the second face sample images at the various smile degrees constitute the sample smile image group.
  • The step size C in the smile gradient function C*N corresponds to the smile degree: when the step size C changes, the resulting smile images have different magnitudes along the "smile" feature dimension, that is, different smile degrees.
  • Through the foregoing processing, a sample smile image group can be obtained, which may include smile images of the different second face sample images at different smile degrees.
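Building the sample smile image group can then be sketched as a loop over step sizes. All names and values below are illustrative assumptions, and the real pipeline would decode each edited latent into an image through the generator:

```python
import numpy as np

# Sketch of building the sample smile image group: for each second face
# sample image's latent code, apply the smile gradient C*N for several step
# sizes C; here the edited latents stand in for the decoded smile images.
dim = 8
normal = np.eye(dim)[0]                          # assumed unit normal N
second_latents = {"face_a": np.zeros(dim), "face_b": np.ones(dim)}
step_sizes = [0.0, 1.0, 2.0, 3.0]                # illustrative smile degrees

sample_smile_group = {
    name: {c: latent + c * normal for c in step_sizes}
    for name, latent in second_latents.items()
}
```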
  • the obtained sample smile image group will be used for training the smile image generation model to generate a trained smile image generation model.
  • The smile image generation model may specifically include a conditional generative adversarial network (cGAN), so that the smile degree of the output smile image is controllable.
  • During training, each smile image in the sample smile image group obtained above and its corresponding smile degree can be used as input, so that the model learns the correspondence between smile degree and smile image.
  • FIG. 4 is a schematic diagram of a data flow of a training method for a smile image generation model provided by an embodiment of the present disclosure.
  • The smile image generation model in this embodiment includes a generator and a discriminator, where the generator is used to generate images and the discriminator is used to judge the authenticity of the images generated by the generator.
  • When training converges, the images generated by the generator pass the discriminator's judgment (that is, are judged as real) and are output.
  • Further, the discriminator of the model not only judges the authenticity of the output image but also judges the smile degree of the image.
  • In an optional implementation, the model may further apply a mouth discriminator based on the mouth region of the target face image to perform supervision.
  • the model obtained at this time is the smile image generation model after training.
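Conceptually, the conditioning can be sketched as a discriminator that scores an (image features, smile degree) pair rather than the image alone. Everything below (shapes, weights, the sigmoid readout) is an illustrative assumption, not the disclosure's architecture:

```python
import numpy as np

# Conceptual sketch of conditional-GAN discrimination: the discriminator
# consumes both an image representation and the smile degree condition, so
# the model can learn the correspondence between smile degree and image.
rng = np.random.default_rng(0)
feat_dim, cond_dim = 16, 1

w = rng.normal(size=feat_dim + cond_dim)   # stand-in discriminator weights
b = 0.0

def cond_discriminator(image_features: np.ndarray, smile_degree: float) -> float:
    """Score a (features, condition) pair; a trained model learns w and b."""
    x = np.concatenate([image_features, [smile_degree]])
    logit = float(w @ x + b)
    return 1.0 / (1.0 + np.exp(-logit))    # probability the pair is "real"

score = cond_discriminator(rng.normal(size=feat_dim), smile_degree=0.7)
```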
  • In summary, the method for training a smile image generation model includes: acquiring a first face sample image; performing first processing on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image; using the smile gradient function, performing second processing on a plurality of second face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile degrees; and using the sample smile image group to train a preset smile image generation model to obtain a trained smile image generation model.
  • With the trained smile image generation model, the target face image can be processed to generate a smile image consistent with the expected smile degree. In this way, a smile image of a face at a desired smile degree can be obtained, providing users with a richer experience.
  • FIG. 5 is a schematic flowchart of a method for generating a smile image provided by an embodiment of the present disclosure.
  • the method for generating a smile image provided by an embodiment of the present disclosure includes:
  • Step 501: Acquire a target face image.
  • Step 502: Input the target face image and an expected smile degree into the trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the expected smile degree, where the smile image generation model is trained by the training method described in any one of the foregoing implementations.
  • Specifically, the device for generating a smile image can acquire a target face image and a desired smile degree, and input both into the trained smile image generation model, so that the model outputs a smile image of the target face image at the desired smile degree.
  • The desired smile degree can be preset by the user. Depending on the application requirements, it can be a single degree, such as "laughing heartily", or a series of degrees, such as a gradient from "not smiling" to "laughing".
  • The first face sample image is acquired; first processing is performed on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image; second processing is performed on a plurality of second face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile levels; and a preset smile image generation model is trained by using the sample smile image group to obtain a trained smile image generation model.
  • The trained model can process a target face image to generate a smile image consistent with a desired smile level. In this way, a smile image of a face at the desired smile level can be obtained, providing a richer experience to users.
  • FIG. 6 is a structural block diagram of a training device for a smile image generation model provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown.
  • the training device for the smile image generation model includes: an image acquisition module 610 , a first processing module 620 , a second processing module 630 , and a third processing module 640 .
  • An image acquisition module 610, configured to acquire a first face sample image;
  • a first processing module 620, configured to perform first processing on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image;
  • a second processing module 630, configured to perform second processing on a plurality of second face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile levels;
  • a third processing module 640, configured to train a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
  • Each first face sample image includes a smile degree value and latent variable features.
  • The first processing module 620 is configured to: classify the latent variable features corresponding to each first face sample image according to the smile degree value of each first face sample image, to obtain first-type latent variable feature samples and second-type latent variable feature samples; train a linear classifier model according to the first-type and second-type latent variable feature samples; obtain the hyperplane output by the trained linear classifier model and the normal vector direction of the hyperplane; and obtain the smile gradient function according to the normal vector.
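A minimal sketch of this step (the perceptron-style classifier, toy 2-D latent codes, and function names are illustrative assumptions; the disclosure only specifies a linear classifier whose hyperplane normal vector yields the smile gradient direction):

```python
def fit_linear_classifier(features, labels, lr=0.1, epochs=200):
    """Train a simple perceptron-style linear classifier.

    `features`: list of latent vectors; `labels`: 1 for smiling samples,
    -1 for non-smiling samples. Returns the weight vector (the normal
    vector of the separating hyperplane) and the bias.
    """
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified: nudge the hyperplane
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def unit_normal(w):
    """Normalize the hyperplane normal; this unit direction serves as
    the smile gradient direction in latent space."""
    norm = sum(wi * wi for wi in w) ** 0.5
    return [wi / norm for wi in w]

# Toy 2-D latent codes: smiling samples cluster at x > 0, non-smiling at x < 0.
feats = [[1.0, 0.2], [1.2, -0.1], [-1.0, 0.3], [-0.9, -0.2]]
labels = [1, 1, -1, -1]
w, b = fit_linear_classifier(feats, labels)
n = unit_normal(w)  # points from "no smile" toward "smile"
```

In practice the classifier would be trained on the high-dimensional latent codes of the first face sample images, but the hyperplane-normal extraction is the same.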
  • The second processing module 630 is configured to, based on the smile gradient function, move each second face sample image along the normal vector direction multiple times with different step lengths, to obtain smile images of each second face sample image at different smile levels; the smile images of each second face sample image at each smile level constitute the sample smile image group.
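The repeated movement along the normal vector with different step lengths can be sketched as a latent-space walk (a hypothetical illustration; the step lengths and the decoding step mentioned in the comment are assumptions, not specified by the disclosure):

```python
def smile_latent_walk(latent, normal, step_lengths):
    """Move a latent code along the smile direction `normal` once per
    step length, yielding one edited latent code per smile level."""
    return [
        [zi + step * ni for zi, ni in zip(latent, normal)]
        for step in step_lengths
    ]

# Toy example: 3-D latent code, unit smile direction along the first axis.
z = [0.5, -0.2, 1.0]
direction = [1.0, 0.0, 0.0]
walked = smile_latent_walk(z, direction, [-1.0, 0.0, 1.0, 2.0])
# Each edited code would then be decoded by a pretrained generator
# into the smile image at the corresponding smile level.
```

The set of decoded images for all step lengths, over all second face sample images, would form the sample smile image group used for training.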
  • The first processing module 620 is configured to sort the latent variable features of each first face sample image according to the smile degree value of each first face sample image to obtain a latent variable feature sequence, and classify the first face sample images according to the latent variable feature sequence to obtain the first-type latent variable feature samples and the second-type latent variable feature samples.
  • the acquired latent variable features of the first human face sample image are determined when using a pre-trained model to generate the first human face sample image.
  • The smile degree value is acquired by using a trained smile classification model to classify the smiles of the faces in the first face sample images.
  • The first processing module 620 is further configured to: before obtaining the image data of the first face sample image, obtain a smile classification sample image and a corresponding smile type label; and train a pre-built smile classification model by using the smile classification sample image and the smile type label, to obtain the trained smile classification model.
  • The smile image generation model includes a generative adversarial network, and the generative adversarial network includes a mouth discriminator; correspondingly, the third processing module 640 is configured to perform mouth-region-based supervision on the smile images output by the smile image generation model through the mouth discriminator.
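A hedged sketch of mouth-region supervision (the crop box, the loss form, and the stub discriminator are assumptions for illustration; the disclosure only states that a mouth discriminator supervises the mouth region of the generated smile images):

```python
import math

def crop_mouth_region(image, box):
    """Crop the mouth region from an image given as a 2-D list of
    pixel rows; `box` is (top, bottom, left, right), exclusive ends."""
    top, bottom, left, right = box
    return [row[left:right] for row in image[top:bottom]]

def mouth_realism_loss(mouth_patch, discriminator):
    """Non-saturating generator loss on the mouth patch only: the
    generator is pushed so the discriminator scores the mouth region
    as real (score near 1)."""
    score = discriminator(mouth_patch)
    return -math.log(max(score, 1e-8))

# Toy 4x4 "image" and a stub discriminator that scores mean brightness.
img = [[0.0, 0.1, 0.2, 0.3],
       [0.4, 0.5, 0.6, 0.7],
       [0.8, 0.9, 1.0, 0.9],
       [0.8, 0.7, 0.6, 0.5]]
patch = crop_mouth_region(img, (2, 4, 1, 3))  # lower-middle region of the face
stub_disc = lambda p: sum(sum(r) for r in p) / (len(p) * len(p[0]))
loss = mouth_realism_loss(patch, stub_disc)
```

In a real pipeline the crop box would come from detected mouth landmarks, and this patch loss would be added to the full-image adversarial loss so that mouth details receive extra supervision.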
  • The training device for a smile image generation model acquires a first face sample image; performs first processing on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image; performs second processing on a plurality of second face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile levels; and trains a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
  • The trained model can process a target face image to generate a smile image consistent with a desired smile level. In this way, a smile image of a face at the desired smile level can be obtained, providing a richer experience to users.
  • FIG. 7 is a structural block diagram of a device for generating a smile image provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown.
  • The device for generating a smile image includes: an image acquisition module 710 and an image generation module 720.
  • An image acquisition module 710 configured to acquire a target face image
  • The image generation module 720 is configured to input the target face image and the desired smile level into the trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the desired smile level;
  • the smile image generation model is trained by the above-mentioned training method.
  • The device for generating a smile image acquires a first face sample image; performs first processing on the first face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a face sample image; performs second processing on a plurality of second face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each second face sample image at different smile levels; and trains a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
  • The trained model can process a target face image to generate a smile image consistent with a desired smile level. In this way, a smile image of a face at the desired smile level can be obtained, providing a richer experience to users.
  • the electronic device provided by the embodiments of the present disclosure can be used to implement the technical solutions of the above method embodiments, and its implementation principles and technical effects are similar, so the embodiments of the present disclosure will not repeat them here.
  • The electronic device 900 may be a terminal device or a server.
  • The terminal device may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), vehicle-mounted terminals (such as vehicle-mounted navigation terminals), and wearable electronic devices, and fixed terminals such as digital TVs, desktop computers, and smart home devices.
  • The electronic device shown in FIG. 8 is only an example, and should not limit the functions and application scope of the embodiments of the present disclosure.
  • An electronic device 900 may include a processor 901 (such as a central processing unit or a graphics processing unit) for performing the above methods, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data necessary for the operation of the electronic device 900.
  • the processor 901, ROM 902, and RAM 903 are connected to each other through a bus 904.
  • An input/output (I/O) interface 905 is also connected to the bus 904 .
  • An input device 906 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; an output device 907 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.
  • a storage device 908 including, for example, a magnetic tape, a hard disk, etc.
  • the communication means 909 may allow the electronic device 900 to perform wireless or wired communication with other devices to exchange data. While FIG. 8 shows electronic device 900 having various means, it is to be understood that implementing or having all of the means shown is not a requirement. More or fewer means may alternatively be implemented or provided.
  • an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program includes instructions for executing the methods shown in the flow charts described in the embodiments of the present disclosure. code.
  • the computer program may be downloaded and installed from a network via communication means 909, or from storage means 908, or from ROM 902.
  • When the computer program is executed by the processor 901, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two.
  • a computer readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained on the computer readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (Radio Frequency, RF for short), etc., or any suitable combination of the above.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
  • the above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device is made to execute the methods shown in the above-mentioned embodiments.
  • Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages.
  • The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • Each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved.
  • Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
  • Exemplary types of hardware logic components include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), etc.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM), flash memory, fiber optics, compact disc read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • a method for training a smile image generation model includes:
  • A plurality of second face sample images are subjected to the second processing to obtain a sample smile image group; wherein the sample smile image group includes smile images of each second face sample image at different smile levels;
  • each first human face sample image includes a smile degree value and a latent variable feature
  • the first processing is performed on the first human face sample image to obtain a smile gradient function, including:
  • The latent variable features corresponding to each first face sample image are classified to obtain first-type latent variable feature samples and second-type latent variable feature samples;
  • the second processing is performed on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group, including:
  • Each second face sample image is moved along the normal vector direction multiple times with different step lengths, to obtain smile images of each second face sample image at different smile levels;
  • the smile images of the second human face sample images at different smile levels constitute the sample smile image group.
  • The classifying of the latent variable features corresponding to each first face sample image to obtain the first-type latent variable feature samples and the second-type latent variable feature samples includes:
  • the hidden variable features of each of the first human face sample images are sorted to obtain a hidden variable feature sequence
  • The latent variable features of the first face sample image are determined when the first face sample image is generated by using a pre-trained model.
  • The smile degree value is acquired by using a trained smile classification model to classify the smiles of the faces in the first face sample images.
  • Before obtaining the image data of the first face sample image, the method further includes:
  • the pre-built smile classification model is trained by using the smile classification sample image and the smile type label to obtain the trained smile classification model.
  • the smile image generation model includes a generation adversarial network, and the generation adversarial network includes a mouth discriminator;
  • the methods include:
  • the mouth region-based supervision process is performed on the smile image output by the smile image generation model through the mouth discriminator.
  • a method for generating a smile image includes:
  • Inputting the target face image and the desired smile level into the trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the desired smile level;
  • the smile image generation model is trained by the training method described in any one of the above first aspects.
  • a training device for a smile image generation model includes:
  • An image acquisition module configured to acquire the first human face sample image
  • the first processing module is configured to perform first processing on the first human face sample image to obtain a smile gradient function; wherein, the smile gradient function is used to represent the degree of smile in the human face sample image;
  • The second processing module is configured to perform second processing on a plurality of second face sample images by using the smile gradient function to obtain a sample smile image group; wherein the sample smile image group includes smile images of each second face sample image at different smile levels;
  • the third processing module is configured to use the sample smile image group to train a preset smile image generation model to obtain a trained smile image generation model.
  • each first human face sample image includes a smile degree value and a latent variable feature
  • The first processing module is specifically configured to: classify the latent variable features corresponding to each first face sample image according to the smile degree value of each first face sample image, to obtain first-type latent variable feature samples and second-type latent variable feature samples; train a linear classifier model according to the first-type and second-type latent variable feature samples; obtain the hyperplane output by the trained linear classifier model and the normal vector direction of the hyperplane; and obtain the smile gradient function according to the normal vector.
  • The second processing module is specifically configured to, based on the smile gradient function, move each second face sample image along the normal vector direction multiple times with different step lengths, to obtain smile images of each second face sample image at different smile levels; the smile images of each second face sample image at each smile level constitute the sample smile image group.
  • The first processing module is specifically configured to sort the latent variable features of each first face sample image according to the smile degree value of each first face sample image to obtain a latent variable feature sequence, and classify the first face sample images according to the latent variable feature sequence to obtain the first-type latent variable feature samples and the second-type latent variable feature samples.
  • the acquired latent variable features of the first human face sample image are determined when using a pre-trained model to generate the first human face sample image.
  • The smile degree value is acquired by using a trained smile classification model to classify the smiles of the faces in the first face sample images.
  • The first processing module is further configured to: before obtaining the image data of the first face sample image, obtain a smile classification sample image and a corresponding smile type label; and train a pre-built smile classification model by using the smile classification sample image and the smile type label, to obtain the trained smile classification model.
  • the smile image generation model includes a generation adversarial network, and the generation adversarial network includes a mouth discriminator;
  • the third processing module is specifically configured to perform supervisory processing based on the mouth area on the smile image output by the smile image generation model through the mouth discriminator.
  • an apparatus for generating a smile image including:
  • An image acquisition module configured to acquire a target face image
  • An image generation module, configured to input the target face image and the desired smile level into the trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the desired smile level;
  • the smile image generation model is trained by the above-mentioned training method.
  • an embodiment of the present disclosure provides an electronic device, including: at least one processor and a memory;
  • the memory stores computer-executable instructions
  • The at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the method for training a smile image generation model according to the first aspect and various possible designs of the first aspect, and/or the method for generating a smile image according to the second aspect and various possible designs of the second aspect.
  • Embodiments of the present disclosure provide a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method for training a smile image generation model according to the first aspect and various possible designs of the first aspect.
  • Embodiments of the present disclosure provide a computer program product, including computer instructions which, when executed by a processor, implement the method for training a smile image generation model according to the first aspect and various possible designs of the first aspect.
  • an embodiment of the present disclosure provides a computer program.
  • When the computer program is executed by a processor, the method for training a smile image generation model according to the first aspect and various possible designs of the first aspect, and/or the method for generating a smile image according to the second aspect and various possible designs of the second aspect, is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

Embodiments of the present disclosure provide a method for training a smile image generation model and a method for generating a smile image, the training method comprising: acquiring a first face sample image; performing first processing on the first face sample image to obtain a smile gradient function, the smile gradient function being used to represent the degree of smiling in the face sample image; performing second processing on multiple second face sample images by using the smile gradient function to obtain a sample smile image group, the sample smile image group comprising smile images of each second face sample image under different degrees of smiling; and training a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model. A target face image can be processed by using the trained smile image generation model so as to generate a smile image that is consistent with a desired degree of smiling. By means of the foregoing, a smile image of a face under the desired degree of smiling can be obtained, providing more experiences to users.

Description

笑容图像生成模型的训练方法、笑容图像的生成方法Method for training smile image generation model, method for generating smile image
相关申请的交叉引用Cross References to Related Applications
本公开要求于2021年07月28日提交的申请号为CN202110858192.X、名称为“笑容图像生成模型的训练方法、笑容图像的生成方法”的中国专利申请的优先权,此申请的内容通过引用并入本文。This disclosure claims the priority of the Chinese patent application with the application number CN202110858192.X and the title "training method of smile image generation model, generation method of smile image" filed on July 28, 2021, the contents of which are incorporated by reference Incorporated into this article.
技术领域technical field
本公开实施例涉及计算机领域,尤其涉及一种笑容图像生成模型的训练方法、笑容图像的生成方法、装置、电子设备、存储介质、计算机程序产品及计算机程序。Embodiments of the present disclosure relate to the field of computers, and in particular to a method for training a smile image generation model, a method for generating a smile image, a device, an electronic device, a storage medium, a computer program product, and a computer program.
背景技术Background technique
随着科技的进步以及娱乐生活的丰富,一些应用APP中,可通过相关的图像处理技术,对人脸图像进行一定处理。现有的方式中,可获得人脸在“笑容”这一属性类型下的人脸图像。With the advancement of science and technology and the enrichment of entertainment life, in some applications, facial images can be processed to a certain extent through relevant image processing technologies. In the existing method, a face image of a face under the attribute type of "smile" can be obtained.
但是,现有处理方式是无法获得全不笑状态和开怀大笑状态之间的各种渐变的笑容图像的,这就使得现有技术所处理得到的笑容图像的适用范围受限。However, the existing processing methods cannot obtain smile images with various gradients between the no-smile state and the full-smile state, which limits the scope of application of the smile images processed by the prior art.
发明内容Contents of the invention
针对上述问题,本公开实施例拱了一种笑容图像生成模型的训练方法、笑容图像的生成方法、装置、电子设备、存储介质、计算机程序产品及计算机程序。In view of the above problems, embodiments of the present disclosure provide a method for training a smile image generation model, a method for generating a smile image, a device, an electronic device, a storage medium, a computer program product, and a computer program.
第一方面,本公开提供了一种笑容图像生成模型的训练方法,包括:In a first aspect, the present disclosure provides a method for training a smile image generation model, including:
获取第一人脸样本图像;Obtain the first face sample image;
对所述第一人脸样本图像进行第一处理,得到笑容渐变函数;其中,所述笑容渐变函数用于表示人脸样本图像中的笑容程度;Performing a first process on the first human face sample image to obtain a smile gradient function; wherein, the smile gradient function is used to represent the degree of smile in the human face sample image;
利用所述笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组;其中,样本笑容图像组中包括每个所述第二人脸样本图像在不同笑容程度下的笑容图像;Using the smile gradient function, a plurality of second human face sample images are subjected to the second processing to obtain a sample smile image group; wherein, the sample smile image group includes each of the second human face sample images under different smile degrees smile image;
利用所述样本笑容图像组对预先设置的笑容图像生成模型进行训练,得到训练后的笑容图像生成模型。Using the sample smile image group to train the preset smile image generation model to obtain the trained smile image generation model.
第二方面,本公开实施例提供了一种笑容图像的生成方法,包括:In a second aspect, an embodiment of the present disclosure provides a method for generating a smile image, including:
获取目标人脸图像;Obtain the target face image;
将所述目标人脸图像和期望笑容程度输入至训练后的笑容图像生成模型,以使所述训练后的笑容图像生成模型输出与所述目标人脸图像在所述期望笑容程度下的笑容图像;Inputting the target face image and the expected smile degree into the trained smile image generation model, so that the trained smile image generation model outputs a smile image corresponding to the target face image under the expected smile degree ;
其中,所述笑容图像生成模型通过如上述权利要求1-8中任一项权利要求所述的训练方法训练得到。Wherein, the smile image generation model is trained by the training method according to any one of claims 1-8.
In a third aspect, the present disclosure provides an apparatus for training a smile image generation model, including:
an image acquisition module, configured to acquire a first human face sample image;
a first processing module, configured to perform first processing on the first human face sample image to obtain a smile gradient function, where the smile gradient function is used to represent a degree of smile in a human face sample image;
a second processing module, configured to perform, by using the smile gradient function, second processing on a plurality of second human face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each of the second human face sample images at different smile degrees; and
a third processing module, configured to train a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
In a fourth aspect, an embodiment of the present disclosure provides an apparatus for generating a smile image, including:
an image acquisition module, configured to acquire a target human face image; and
an image generation module, configured to input the target human face image and a desired smile degree into a trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target human face image at the desired smile degree;
where the smile image generation model is trained by the training method according to any implementation of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory, where
the memory stores computer-executable instructions; and
the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the method for training a smile image generation model according to the first aspect and any possible implementation of the first aspect, and/or the method for generating a smile image according to the second aspect and any possible implementation of the second aspect.
In a sixth aspect, an embodiment of the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method for training a smile image generation model according to the first aspect and any possible implementation of the first aspect, and/or the method for generating a smile image according to the second aspect and any possible implementation of the second aspect.
In a seventh aspect, an embodiment of the present disclosure provides a computer program product, including computer instructions which, when executed by a processor, implement the method for training a smile image generation model according to the first aspect and any possible implementation of the first aspect, and/or the method for generating a smile image according to the second aspect and any possible implementation of the second aspect.
In an eighth aspect, an embodiment of the present disclosure provides a computer program which, when executed by a processor, implements the method for training a smile image generation model according to the first aspect and any possible implementation of the first aspect, and/or the method for generating a smile image according to the second aspect and any possible implementation of the second aspect.
According to the method for training a smile image generation model and the method for generating a smile image provided by the embodiments of the present disclosure, a first human face sample image is acquired; first processing is performed on the first human face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a human face sample image; second processing is performed on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each of the second human face sample images at different smile degrees; and a preset smile image generation model is trained by using the sample smile image group to obtain a trained smile image generation model. With the trained smile image generation model, a target human face image can be processed to generate a smile image consistent with a desired smile degree. In this way, a smile image of a human face at the desired smile degree can be obtained, bringing a richer experience to the user.
Description of the drawings
To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show some embodiments of the present disclosure, and persons of ordinary skill in the art may derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based;
FIG. 2 is a schematic flowchart of a method for training a smile image generation model according to an embodiment of the present disclosure;
FIG. 3 is a schematic data flow diagram of a method for training a smile image generation model according to the present disclosure;
FIG. 4 is a schematic data flow diagram of a method for training a smile image generation model according to an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of a method for generating a smile image according to an embodiment of the present disclosure;
FIG. 6 is a structural block diagram of an apparatus for training a smile image generation model according to an embodiment of the present disclosure;
FIG. 7 is a structural block diagram of an apparatus for generating a smile image according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present disclosure.
Detailed description
To make the objectives, technical solutions, and advantages of the embodiments of the present disclosure clearer, the following clearly and completely describes the technical solutions in the embodiments of the present disclosure with reference to the accompanying drawings in the embodiments of the present disclosure. Apparently, the described embodiments are some rather than all of the embodiments of the present disclosure. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.
Generally, existing image processing is implemented based on machine learning techniques. By training a generative adversarial network model, such as one based on the StyleGAN architecture, images of a human face under different attribute types can be generated.
Taking the attribute type of smile as an example, to enable the model to transform an input human face image in a completely non-smiling state into an output human face image in a heartily laughing state, the following steps are generally taken:
First, the model needs to be trained. Specifically, a large number of human face sample images labeled as laughing or non-smiling may be used to train the above generative adversarial network model based on the StyleGAN architecture, so that the model can separate and control the image features of the smile attribute type. Then, a human face image in the non-smiling state is input into the trained model, and the model outputs the corresponding human face image in the laughing state.
However, the human face images that existing processing methods can output are relatively limited. For the smile attribute type, for example, they can only process a non-smiling human face image to obtain a laughing human face image, and cannot obtain human face images at more smile degrees. As a result, the human face images produced by the prior art are limited in content, poor in image quality, and narrow in applicability, which is unfavorable for the long-term development of the processing technology.
In view of this problem, the inventor found that the reason why the generative adversarial network model cannot generate gradient smile images is the lack of sufficient training samples of smile gradient images. That is, the human face image samples used to train existing models generally include only the non-smiling state and the laughing state, and acquiring samples of human face images with smile gradients would require substantial time and labor costs.
In this case, the inventor conceived that a smile gradient function representing the degree of smile in the first human face sample image may be constructed first, and the smile gradient function may then be used to process second human face sample images to obtain a sample smile image group including smile images of each of the second human face sample images at different smile degrees; the sample human face images are then used to train a smile image generation model. With the trained smile image generation model, a target human face image can be processed to generate a smile image consistent with a desired smile degree. In this way, smile images of a human face at desired smile degrees can be obtained; the time and labor costs of acquiring the smile images are lower than in the prior art; and, since smile images at different smile degrees can be generated, these images can also effectively broaden the applicability of human face images and bring a richer experience to the user.
Referring to FIG. 1, FIG. 1 is a schematic diagram of a network architecture on which the present disclosure is based. The network architecture shown in FIG. 1 may specifically include at least one terminal 1 and a server 2.
The terminal 1 may specifically be a hardware device such as a user's mobile phone, a smart home device, a tablet computer, or a wearable electronic device. The server 2 may specifically be a server or a server cluster deployed in the cloud.
A smile gradient function may be constructed in advance by using first human face sample images, and the smile gradient function may be used to process second human face sample images to obtain a sample smile image group, with which the smile image generation model can be trained.
The trained smile image generation model processes a target human face image, generates a smile image at a desired smile degree, and presents the smile image to the user.
The architecture shown in FIG. 1 is applicable to scenarios of various applications that can be used for image processing, such as image special-effect applications and applications with filter-based shooting functions.
Specifically, the method for generating a smile image provided in the present disclosure can be applied to scenarios based on human face image special effects and the like.
Human face image special effects are special effects widely used in video applications. With the method for generating a smile image provided in the present disclosure, a target human face image can be turned into a series of smile images at different smile degrees that present a gradient smile, providing users with more combinations of special effects and video gameplay.
The solution provided by the present disclosure is further described below.
In the first aspect, FIG. 2 is a schematic flowchart of a method for training a smile image generation model according to an embodiment of the present disclosure. Referring to FIG. 2, the method includes:
Step 201: acquire a first human face sample image.
Step 202: perform first processing on the first human face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a human face sample image.
Step 203: perform, by using the smile gradient function, second processing on a plurality of second human face sample images to obtain a sample smile image group, where the sample smile image group includes smile images of each of the second human face sample images at different smile degrees.
Step 204: train a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
As described above, to enable the smile image generation model to generate smile images of a target human face image at different smile degrees, sufficient sample images at different smile degrees are first needed to train the model; see steps 201 to 203. Then, referring to step 204, the smile image generation model can be trained based on the obtained sample smile image group, so as to obtain a trained smile image generation model that processes a target human face image and outputs a smile image of the target human face image at a desired smile degree.
Specifically, in the method for training a smile image generation model provided in the present disclosure, the apparatus for training the smile image generation model first acquires a large number of first human face sample images. To facilitate constructing the smile gradient function, some of the first human face sample images include smiling faces, while others include non-smiling faces.
After the first human face sample images are acquired, first processing including feature extraction and image analysis may be performed on them to construct the smile gradient function.
It can be understood that the smile gradient function is used to represent the degree of smile in a human face sample image, and its variables may specifically include the smile amplitude.
The degree of smile may be determined by a plurality of facial feature factors. For example, the larger the exposed area of the teeth, the higher the degree of smile; when the eyes are squinted, the degree of smile is higher; and when laugh lines appear around the corners of the eyes, the degree of smile is higher. By performing the first processing, including feature extraction and image analysis, on a first human face sample image, the degree of smile corresponding to the first human face sample image can be determined according to the above factors, and the corresponding smile gradient function can then be obtained.
Then, the training apparatus may use the smile gradient function to perform second processing, including image transformation, on the plurality of second human face sample images, so that each processed second human face sample image presents a different smile degree according to the variable in the smile gradient function, thereby obtaining a sample smile image group including smile images of each of the second human face sample images at different smile degrees.
Since the smile images in the generated sample smile image group are used to train the smile image generation model, to ensure the training effect, the sample smile image group in the present disclosure should include smile images at as many different smile degrees as possible, so as to enrich the training samples.
Based on this, when generating the sample smile image group, the variable in the smile gradient function may be assigned multiple values, and each second human face sample image may be processed under the different values, so as to obtain smile images of each sample human face image at different smile degrees.
Finally, after the sample smile image group is obtained, the preset smile image generation model may be trained with the sample smile image group to obtain the smile image generation model.
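The construction of the sample smile image group described above can be sketched as follows. This is a minimal illustration only: `apply_smile_gradient` is a hypothetical stand-in for the second processing driven by the smile gradient function, and the chosen step values are assumptions, not part of the disclosure.

```python
import numpy as np

def apply_smile_gradient(image: np.ndarray, degree: float) -> np.ndarray:
    # Hypothetical placeholder: in the disclosure, this step would re-render
    # the face at the given smile degree via the smile gradient function.
    # Here it only copies the array so the pipeline structure is visible.
    return image.copy()

def build_sample_smile_image_group(second_samples, degrees):
    # One smile image per (second human face sample image, smile degree) pair.
    group = {}
    for idx, image in enumerate(second_samples):
        group[idx] = [apply_smile_gradient(image, d) for d in degrees]
    return group

# Toy second human face sample images (arrays standing in for photos).
samples = [np.zeros((4, 4)) for _ in range(3)]
degrees = np.linspace(0.0, 10.0, num=5)  # 5 smile degrees from 0 to 10
group = build_sample_smile_image_group(samples, degrees)
```

Assigning multiple values to the step variable, as the text describes, is what turns each second sample image into a set of images at different smile degrees.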
In an optional implementation, a specific method for constructing the smile gradient function is also provided:
Since the smile gradient function is used to represent the degree of smile in a human face image, it can be understood as the product involving the step size (amplitude) by which the human face image moves along the "smile" direction (the normal vector direction). To obtain the smile gradient function, the "smile" direction, that is, the normal vector direction, must first be determined. In an optional implementation of the present disclosure, the normal vector direction of the smile gradient function may be determined based on a linear SVM classifier.
In simple terms, a linear SVM classifier is a simple and effective form of classifier that can be used to separate features to be classified so that similar features are grouped into the same category. In this process, the linear SVM classifier determines a hyperplane: features on one side of the hyperplane belong to one category, and features on the other side belong to another category. It is exactly this property that is exploited here: based on the determined hyperplane, the normal vector of the hyperplane is obtained, and the direction of the normal vector is taken as the normal vector direction of the smile gradient function, so as to obtain the corresponding smile gradient function.
Based on this principle, in this implementation, the linear SVM classifier may first classify the latent variable features according to the smile types of the first human face sample images, and the linear classifier model may then be trained based on the classification result.
A latent variable generally refers to an unobservable random variable. In this implementation, for a first human face sample image, the smile type can be understood as an observable variable, while the latent variable feature can be understood as an unobservable variable.
That is, there are a plurality of first human face sample images, and each first human face sample image has a smile degree value and a latent variable feature.
In an optional implementation, the smile degree value of each first human face sample image may be obtained by performing smile classification on the face in the first human face sample image by using a trained smile classification model.
The smile classification model is a neural network model that can be used to classify and evaluate the degree of smile. It may specifically be a classification model based on the resnet-101 network architecture, which classifies and evaluates an input image along the smile dimension.
Generally, for the preset smile classification model to be usable for classifying and evaluating the degree of smile, it may be trained in advance. That is, in an optional implementation: smile classification sample images and corresponding smile type annotations are acquired, and a pre-built smile classification model is trained with the smile classification sample images and the smile type annotations to obtain the trained smile classification model.
The smile type annotations are obtained by manually judging whether each smile classification sample image is "smiling" or "not smiling". For example, when the face in a smile classification sample image has a big smile, the smile type of the image is annotated as "laughing"; similarly, when the face has a slight smile, the smile type is annotated as "smiling"; and when the face has no smile, the smile type is annotated as "not smiling". Using the smile classification sample images and the corresponding smile type annotations, the smile classification model can learn "smiling" images, "non-smiling" images, and the types of smile within "smiling" images, and these types can correspond to different smile degree values, with "laughing" having the highest smile degree value and "not smiling" the lowest.
The trained smile classification model can then perform smile type recognition on each first human face sample image, so as to obtain the smile degree value of each first human face sample image.
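For illustration, the correspondence between manual smile type annotations and smile degree values might be encoded as below. The 0-to-10 scale follows the example given later in this description; the exact intermediate value assigned to "smiling" is an assumption.

```python
# Hypothetical mapping from manual smile type annotations to smile degree
# values on a 0-10 scale ("not smiling" lowest, "laughing" highest).
SMILE_DEGREE = {
    "not smiling": 0,
    "smiling": 5,
    "laughing": 10,
}

def smile_degree(annotation: str) -> int:
    # Look up the degree value associated with a smile type annotation.
    return SMILE_DEGREE[annotation]
```

In practice these values would come from the trained smile classification model rather than a fixed table; the table only makes the ordering of the types concrete.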
In addition, the latent variable feature of a first human face sample image is determined when the first human face sample image is generated by a pre-trained model.
FIG. 3 is a schematic data flow diagram of a method for training a smile image generation model provided by the present disclosure. Referring to FIG. 3, to generate the first human face sample images, a pre-trained model is preset, which may include a StyleGAN model capable of generating a large number of first human face sample images. When a first human face sample image is generated, the parameters on which the pre-trained model is based are recorded to constitute the latent variable feature of the first human face sample image.
Constructing the smile gradient function from the first human face sample images may include the following steps 2021 to 2024:
Step 2021: classify the latent variable features corresponding to the first human face sample images according to the smile degree values of the first human face sample images, to obtain first-type latent variable feature samples and second-type latent variable feature samples.
Step 2022: train a linear classifier model according to the first-type latent variable feature samples and the second-type latent variable feature samples.
Step 2023: obtain the hyperplane output by the trained linear classifier model, and obtain the normal vector direction of the hyperplane.
Step 2024: obtain the smile gradient function according to the normal vector.
Referring to FIG. 3, to train the linear classifier model so that it outputs a hyperplane, feature samples usable for the training must first be obtained. As described in the principle above, for the hyperplane determined by a linear SVM classifier, features on one side of the hyperplane belong to one category and features on the other side belong to another category. Based on this, the latent variable feature samples can be divided into two classes according to their smile degree values, so that one class of latent variable feature samples serves as the features on one side of the hyperplane and the other class as the features on the other side.
Based on this, in an optional implementation, after the smile types and latent variable features of the first human face sample images are obtained, the latent variable features are classified to obtain the first-type latent variable feature samples (positive latent variable feature samples) and the second-type latent variable feature samples (negative latent variable feature samples) for training the linear SVM classifier, so that the trained linear SVM classifier can accurately separate the first-type and second-type latent variable feature samples.
Specifically, the training apparatus sorts the latent variable features of the first human face sample images according to their smile degree values to obtain a latent variable feature sequence, and then classifies the first human face sample images according to the latent variable feature sequence to obtain the first-type latent variable feature samples and the second-type latent variable feature samples.
The smile degree value of each first human face sample image may be represented by a value between 0 and 10, where 0 represents not smiling and 10 represents laughing heartily.
基于此,可根据笑容程度值对应的表征取值对第一人脸样本图像进行排序,并得到基于[第一人脸样本图像,隐变量特征,笑容类型]的排序(例如,以笑容类型进行排序),例如:Based on this, the first human face sample images can be sorted according to the characterization values corresponding to the smile degree values, and the sorting based on [the first human face sample images, latent variable features, smile types] (for example, by smile type) can be obtained. sort), for example:
[第一人脸样本图像4,隐变量特征4,笑容类型10];[first face sample image 4, hidden variable feature 4, smile type 10];
[第一人脸样本图像2,隐变量特征2,笑容类型7];[The first face sample image 2, latent variable feature 2, smile type 7];
[第一人脸样本图像1,隐变量特征3,笑容类型1];[First face sample image 1, latent variable feature 3, smile type 1];
[第一人脸样本图像3,隐变量特征3,笑容类型0]。[First face sample image 3, latent variable feature 3, smile type 0].
然后,基于序列,对各隐变量特征分类,得到第一类隐变量特征样本(正隐变量特征样本) 和第二类隐变量特征样本(负隐变量特征样本)。Then, based on the sequence, each hidden variable feature is classified to obtain the first type of hidden variable feature samples (positive latent variable feature samples) and the second type of latent variable feature samples (negative latent variable feature samples).
即,以笑容程度值的表征取值为5作为表征取值的分类中间值,那么上述示例中的第一类隐变量特征样本(正隐变量特征样本)包括:[第一人脸样本图像4,隐变量特征4,笑容类型10]以及[第一人脸样本图像2,隐变量特征2,笑容类型7]。That is, with the characterization value of the smile degree value being 5 as the classification intermediate value of the characterization value, then the first type of latent variable feature samples (positive latent variable feature samples) in the above example include: [the first human face sample image 4 , hidden variable feature 4, smile type 10] and [first face sample image 2, hidden variable feature 2, smile type 7].
而第二类隐变量特征样本(负隐变量特征样本)包括:[第一人脸样本图像1,隐变量特征3,笑容类型1]以及[第一人脸样本图像3,隐变量特征3,笑容类型0]。The second class of latent variable feature samples (negative latent variable feature samples) comprises: [first human face sample image 1, latent variable feature 3, smile type 1] and [first human face sample image 3, latent variable feature 3, smile type 0].
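The sort-and-split step above can be sketched as follows. This is a minimal illustration with hypothetical image and latent-feature identifiers; the threshold of 5 follows the classification midpoint described above.

```python
# Illustrative sketch (hypothetical data): sort latent features by smile
# degree value and split them into positive/negative sample classes.
samples = [
    ("face_4", "latent_4", 10),  # (image id, latent feature id, smile degree)
    ("face_1", "latent_3", 1),
    ("face_2", "latent_2", 7),
    ("face_3", "latent_3", 0),
]

# Sort by smile degree value, highest first, to form the feature sequence.
ordered = sorted(samples, key=lambda s: s[2], reverse=True)

# Degree 5 serves as the midpoint separating "smiling" from "not smiling".
THRESHOLD = 5
positive = [s for s in ordered if s[2] >= THRESHOLD]  # first-class (positive) samples
negative = [s for s in ordered if s[2] < THRESHOLD]   # second-class (negative) samples
```

Applied to the example tuples above, `positive` holds images 4 and 2 while `negative` holds images 1 and 3, matching the classification in the text.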
随后,继续参考图3,训练装置将利用得到的第一类隐变量特征样本和第二类隐变量特征样本对线性SVM分类器(即线性分类器模型)进行训练,以根据训练后的线性SVM分类器的超平面的法向量,得到法向量方向N,而法向量方向N以及在其法向量方向上的步长C的乘积C*N可以构成笑容渐变函数。Subsequently, still referring to Fig. 3, the training device trains a linear SVM classifier (i.e., the linear classifier model) with the obtained first-class and second-class latent variable feature samples, and derives the normal vector direction N from the hyperplane of the trained linear SVM classifier; the product C*N of the normal direction N and a step size C along that direction constitutes the smile gradient function.
如前原理所述的,利用该线性SVM分类器可确定出超平面,以用来确定前述的笑容渐变函数。其中,可将笑容渐变函数与超平面进行关联,以使超平面用于将对应于“笑”这一笑容类型的特征与“不笑”这一笑容类型的特征区分开来。然后,基于该超平面,可确定出超平面的法向量方向N,在该法向量方向N的基础上进行一定移动(即步长C),并最终得到笑容渐变函数C*N,该笑容渐变函数C*N可用于表示图像在“笑”这一特征维度上的幅度,即人脸图像中的笑容程度。As described above, the linear SVM classifier determines a hyperplane, which is used to derive the aforementioned smile gradient function. The smile gradient function is associated with this hyperplane, which separates features of the smile type "smiling" from those of the type "not smiling". From the hyperplane, the normal vector direction N is determined; moving along N by a certain amount (i.e., the step size C) finally yields the smile gradient function C*N, which represents the magnitude of an image along the "smile" feature dimension, i.e., the degree of smile in the face image.
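The SVM step can be sketched in Python. This is a hedged illustration, not the patent's implementation: it uses scikit-learn's `LinearSVC` as a stand-in for the linear classifier model, and the latent codes are synthetic 8-dimensional data constructed so a separating hyperplane exists.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Hypothetical latent codes: positive (smiling) samples are shifted away from
# negative (non-smiling) samples so a linear boundary between them exists.
rng = np.random.default_rng(0)
pos = rng.normal(loc=+2.0, size=(50, 8))  # first-class latent feature samples
neg = rng.normal(loc=-2.0, size=(50, 8))  # second-class latent feature samples
X = np.vstack([pos, neg])
y = np.array([1] * 50 + [0] * 50)

# Train the linear classifier model on the two sample classes.
svm = LinearSVC().fit(X, y)

# The hyperplane's normal vector gives the "smile" direction N in latent space.
N = svm.coef_[0] / np.linalg.norm(svm.coef_[0])

def smile_gradient(C):
    """Smile gradient function C*N: a latent-space offset whose magnitude C
    controls the smile intensity along the normal direction."""
    return C * N
```

Adding `smile_gradient(C)` to a latent code moves the corresponding face along the "smile" dimension; larger `C` means a stronger smile.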
当获得笑容渐变函数C*N之后,还可如步骤103所示的,利用所述笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组。After the smile gradient function C*N is obtained, as shown in step 103, the smile gradient function can be used to perform a second process on a plurality of second human face sample images to obtain a sample smile image group.
具体的,第二人脸样本图像可包括:将噪声数据添加至第一人脸样本图像而得到的人脸图像,也可为通过随机生成的方式生成的人脸图像。Specifically, the second human face sample images may include face images obtained by adding noise data to the first human face sample images, or face images generated randomly.
在该步骤中,可基于所述笑容渐变函数,对每个所述第二人脸样本图像在所述法向量方向上的幅值进行多次不同步长的移动,得到每个第二人脸样本图像在不同笑容程度下的笑容图像;所述各个第二人脸样本图像在各笑容程度下的笑容图像构成所述样本笑容图像组。In this step, based on the smile gradient function, the amplitude of each second human face sample image along the normal vector direction is shifted multiple times with different step sizes, producing smile images of each second human face sample image at different smile degrees; the smile images of the second human face sample images at the various smile degrees constitute the sample smile image group.
即,通过改变笑容渐变函数C*N中的步长C,以形成多个笑容渐变函数,并基于该多个笑容渐变函数分别对第二人脸样本图像进行图像处理,得到每个第二人脸样本图像的多个笑容图像。其中,步长C对应不同的笑容程度,即,当步长C变化时,得到的笑容图像在“笑”这一特征维度上的幅度是不同的,即笑容程度不同。That is, by varying the step size C in the smile gradient function C*N, multiple smile gradient functions are formed, and each is applied to the second human face sample images, yielding multiple smile images per second human face sample image. Different step sizes C correspond to different smile degrees: as C changes, the resulting smile images differ in magnitude along the "smile" feature dimension, i.e., in smile degree.
通过该方式可获得样本笑容图像组,该样本笑容图像组中可以包括有不同第二人脸样本图像在不同笑容程度下的笑容图像。In this manner, a sample smile image group can be obtained, and the sample smile image group may include smile images of different second human face sample images at different smile levels.
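The construction of the sample smile image group can be sketched as follows, assuming the same latent-space setup as above. The step sizes, latent dimension, and the `make_smile_group` helper are illustrative assumptions; decoding each shifted latent into an actual image would be done by a generator, which is not shown.

```python
import numpy as np

def make_smile_group(latents, N, steps=(0.0, 0.5, 1.0, 1.5, 2.0)):
    """For each second-sample latent code, shift its amplitude along the
    normal direction N by several different step sizes C, yielding one
    shifted latent per smile degree."""
    group = {}
    for i, w in enumerate(latents):
        # One entry per step size C; larger C means a stronger smile.
        group[i] = [w + C * N for C in steps]
    return group

# Hypothetical 8-dimensional latent space with a unit "smile" direction.
N = np.zeros(8)
N[0] = 1.0
latents = np.random.default_rng(1).normal(size=(3, 8))
group = make_smile_group(latents, N)
```

Each value in `group` is one second sample's set of latents at increasing smile degrees; decoding all of them produces the sample smile image group described above.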
该得到的样本笑容图像组将用于对笑容图像生成模型的训练,以生成训练后的笑容图像生成模型。The obtained sample smile image group will be used for training the smile image generation model to generate a trained smile image generation model.
笑容图像生成模型具体可包括基于条件的生成对抗网络(Conditional Generative Adversarial Network),以使得输出的笑容图像的笑容程度可控。The smile image generation model may specifically include a conditional generative adversarial network (Conditional Generative Adversarial Network), so that the smile degree of the output smile image is controllable.
因此,在训练模型时,可将前述得到的样本笑容图像组中的每个笑容图像以及相应的笑容程度作为输入,以使模型可学习笑容程度与笑容图像之间的对应关系。Therefore, when training the model, each smile image and the corresponding smile degree in the sample smile image group obtained above can be used as input, so that the model can learn the correspondence between the smile degree and the smile image.
图4为本公开实施例提供的一种笑容图像生成模型的训练方法的数据流示意图。如图4所示的,与传统的GAN架构模型类似,本实施方式中的笑容图像生成模型包括有生成器和判别器,其中的生成器用于生成图像,而判别器用于判别生成器所生成图像的真伪。通过训练,生成器所生成的图像将通过判别器的判别(即判别为真),以作为输出的图像。FIG. 4 is a schematic data-flow diagram of a training method for a smile image generation model provided by an embodiment of the present disclosure. As shown in FIG. 4, similarly to a traditional GAN architecture, the smile image generation model in this embodiment includes a generator and a discriminator: the generator generates images, and the discriminator judges the authenticity of the images generated by the generator. Through training, the images produced by the generator come to pass the discriminator's judgment (i.e., are judged real) and are emitted as output.
与传统的GAN架构模型不同的是,在本实施方式中,由于模型的目标是输出具有期望笑容程度的笑容图像,因此,模型的判别器在判定输出图像的真伪的同时,还将进行图像的笑容程度的判定。Unlike the traditional GAN architecture, in this embodiment the goal of the model is to output a smile image with a desired smile degree; therefore, the model's discriminator judges not only the authenticity of the output image but also its smile degree.
具体的,模型可设置基于嘴部区域的嘴部判别器以进行监督处理,通过所述嘴部判别器对笑容图像生成模型所输出的笑容图像进行基于嘴部区域的监督处理。Specifically, the model may employ a mouth discriminator based on the mouth region for supervision: the mouth discriminator performs mouth-region-based supervision on the smile images output by the smile image generation model.
基于上述方式,当生成器所生成的图像既可通过判别器对于图像真伪的判别,又可通过判别器对其笑容程度的判定时,此时得到的模型为训练后的笑容图像生成模型。Based on the above, when the images generated by the generator pass both the discriminator's authenticity judgment and its smile degree judgment, the resulting model is the trained smile image generation model.
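A structural sketch of such a two-headed discriminator, written with PyTorch, is shown below. Every layer size, the 256x256 input resolution, and the mouth-crop coordinates are illustrative assumptions, not the patent's actual architecture; the sketch only shows the idea of judging authenticity and smile degree jointly, with a separate crop for mouth-region supervision.

```python
import torch
import torch.nn as nn

class SmileDiscriminator(nn.Module):
    """Assumed sketch: one shared backbone with two heads, one judging
    real/fake and one predicting the smile degree of the input image."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.real_fake = nn.Linear(64, 1)     # authenticity judgment
        self.smile_degree = nn.Linear(64, 1)  # predicted smile degree

    def forward(self, img):
        h = self.backbone(img)
        return self.real_fake(h), self.smile_degree(h)

def mouth_crop(img, box=(96, 160, 48, 208)):
    """Crop the (assumed) mouth region of a batch of images so a dedicated
    mouth discriminator can supervise that region separately."""
    top, bottom, left, right = box
    return img[:, :, top:bottom, left:right]
```

In training, the generator's output would be scored by both heads, and `mouth_crop` would feed a second, mouth-only discriminator of the same shape.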
本公开实施例提供的笑容图像生成模型的训练方法,获取第一人脸样本图像;对第一人脸样本图像进行第一处理,得到笑容渐变函数;其中,笑容渐变函数用于表示人脸样本图像中的笑容程度;利用笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组;样本笑容图像组中包括每个第二人脸样本图像在不同笑容程度下的笑容图像;利用样本笑容图像组对预先设置的笑容图像生成模型进行训练,得到训练后的笑容图像生成模型。利用该训练后的笑容图像生成模型,能够对目标人脸图像进行处理,以生成与期望笑容程度一致的笑容图像。通过这样的方式能够获得人脸的在期望笑容程度下的笑容图像,为用户带来更多的体验。In the training method for a smile image generation model provided by embodiments of the present disclosure, first human face sample images are acquired; a first processing is performed on them to obtain a smile gradient function, which represents the degree of smile in a human face sample image; using the smile gradient function, a second processing is performed on multiple second human face sample images to obtain a sample smile image group, which includes smile images of each second human face sample image at different smile degrees; and the sample smile image group is used to train a preset smile image generation model, yielding the trained smile image generation model. With the trained model, a target face image can be processed to generate a smile image matching a desired smile degree. In this way, smile images of a face at the desired smile degree can be obtained, offering users a richer experience.
第二方面,图5为本公开实施例提供的一种笑容图像的生成方法的流程示意图。参考图5,本公开实施例提供的笑容图像的生成方法,包括:In the second aspect, FIG. 5 is a schematic flowchart of a method for generating a smile image provided by an embodiment of the present disclosure. Referring to FIG. 5 , the method for generating a smile image provided by an embodiment of the present disclosure includes:
步骤501、获取目标人脸图像; Step 501, acquiring a target face image;
步骤502、将所述目标人脸图像和期望笑容程度输入至训练后的笑容图像生成模型,以使所述训练后的笑容图像生成模型输出所述目标人脸图像在所述期望笑容程度下的笑容图像;其中,所述笑容图像生成模型通过如上述任一项实施方式所述的训练方法训练得到。Step 502: input the target face image and a desired smile degree into the trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the desired smile degree; the smile image generation model is trained by the training method of any of the above embodiments.
当模型完成训练后,笑容图像的生成装置可获取目标人脸图像和期望笑容程度,然后,将二者输入至训练后的笑容图像生成模型,以使所述训练后的笑容图像生成模型输出所述目标人脸图像在所述期望笑容程度下的笑容图像。After the model is trained, the smile image generating device acquires a target face image and a desired smile degree, and inputs both into the trained smile image generation model, so that the model outputs a smile image of the target face image at the desired smile degree.
其中的期望笑容程度可为用户预设的;同时,基于不同的应用需求,该期望笑容程度可为单一的程度,如“开怀大笑”,该期望笑容程度还可为一系列程度,如“不笑至开怀大笑”。The desired smile degree may be preset by the user; moreover, depending on the application, it may be a single degree, such as "laughing heartily", or a series of degrees, such as "from not smiling to laughing heartily".
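The inference flow can be sketched as follows. The `generate_smiles` helper and the model's call signature are hypothetical; the sketch only shows how a single desired degree or a series of degrees (e.g. "from not smiling to laughing heartily") would be fed to the trained model.

```python
def generate_smiles(model, face_image, desired_degrees):
    """Run the trained smile image generation model once per desired smile
    degree and collect the output smile images."""
    if isinstance(desired_degrees, (int, float)):
        desired_degrees = [desired_degrees]  # single degree -> one-element series
    return [model(face_image, degree) for degree in desired_degrees]

# Stand-in model for illustration: echoes its inputs instead of generating.
fake_model = lambda img, degree: (img, degree)

single = generate_smiles(fake_model, "face.png", 10)            # "laughing heartily"
series = generate_smiles(fake_model, "face.png", range(0, 11))  # not smiling -> laughing
```

A series of degrees produces one smile image per degree, which could be assembled into a smooth smile animation of the target face.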
本公开实施例提供的笑容图像的生成方法,获取第一人脸样本图像;对第一人脸样本图像进行第一处理,得到笑容渐变函数;其中,笑容渐变函数用于表示人脸样本图像中的笑容程度;利用笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组;样本笑容图像组中包括每个第二人脸样本图像在不同笑容程度下的笑容图像;利用样本笑容图像组对预先设置的笑容图像生成模型进行训练,得到训练后的笑容图像生成模型。利用该训练后的笑容图像生成模型,能够对目标人脸图像进行处理,以生成与期望笑容程度一致的笑容图像。通过这样的方式能够获得人脸的在期望笑容程度下的笑容图像,为用户带来更多的体验。In the method for generating a smile image provided by embodiments of the present disclosure, first human face sample images are acquired; a first processing is performed on them to obtain a smile gradient function representing the degree of smile in a human face sample image; using the smile gradient function, a second processing is performed on multiple second human face sample images to obtain a sample smile image group including smile images of each second human face sample image at different smile degrees; and the sample smile image group is used to train a preset smile image generation model, yielding the trained smile image generation model. With the trained model, a target face image can be processed to generate a smile image matching a desired smile degree. In this way, smile images of a face at the desired smile degree can be obtained, offering users a richer experience.
对应于上文实施例的笑容图像生成模型的训练方法,图6为本公开实施例提供的笑容图像生成模型的训练装置的结构框图。为了便于说明,仅示出了与本公开实施例相关的部分。Corresponding to the method for training a smile image generation model in the above embodiment, FIG. 6 is a structural block diagram of a training device for a smile image generation model provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown.
参照图6,所述笑容图像生成模型的训练装置包括:图像获取模块610、第一处理模块620、第二处理模块630、第三处理模块640。Referring to FIG. 6 , the training device for the smile image generation model includes: an image acquisition module 610 , a first processing module 620 , a second processing module 630 , and a third processing module 640 .
图像获取模块610,用于获取第一人脸样本图像;An image acquisition module 610, configured to acquire the first human face sample image;
第一处理模块620,用于对所述第一人脸样本图像进行第一处理,得到笑容渐变函数;其中,所述笑容渐变函数用于表示人脸样本图像中的笑容程度;The first processing module 620 is configured to perform first processing on the first human face sample image to obtain a smile gradient function; wherein, the smile gradient function is used to represent the degree of smile in the human face sample image;
第二处理模块630,用于利用所述笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组;其中,样本笑容图像组中包括每个所述第二人脸样本图像在不同笑容程度下的笑容图像;The second processing module 630 is configured to perform a second processing on multiple second human face sample images using the smile gradient function to obtain a sample smile image group, wherein the sample smile image group includes smile images of each second human face sample image at different smile degrees;
第三处理模块640,用于利用所述样本笑容图像组对预先设置的笑容图像生成模型进行训练,得到训练后的笑容图像生成模型。The third processing module 640 is configured to use the sample smile image group to train a preset smile image generation model to obtain a trained smile image generation model.
可选的,所述第一人脸样本图像的数量为多个,每个第一人脸样本图像的图像数据中包括笑容程度值和隐变量特征。Optionally, there are multiple first human face sample images, and the image data of each first human face sample image includes a smile degree value and latent variable features.
第一处理模块620,用于根据各所述第一人脸样本图像的笑容程度值,对所述各所述第一人脸样本图像对应的隐变量特征进行样本分类,得到第一类隐变量特征样本和第二类隐变量特征样本;根据所述第一类隐变量特征样本和所述第二类隐变量特征样本对线性分类器模型进行训练;获取训练后的线性分类器模型输出的超平面,并获取所述超平面的法向量方向;根据所述法向量,得到所述笑容渐变函数。The first processing module 620 is configured to classify the latent variable features corresponding to the first human face sample images according to the smile degree value of each first human face sample image, obtaining first-class and second-class latent variable feature samples; train a linear classifier model with the first-class and second-class latent variable feature samples; obtain the hyperplane output by the trained linear classifier model and the normal vector direction of the hyperplane; and obtain the smile gradient function from the normal vector.
可选的,所述第二处理模块630,用于基于所述笑容渐变函数,对每个所述第二人脸样本图像在所述法向量方向上的幅值进行多次不同步长的移动,得到每个第二人脸样本图像在不同笑容程度下的笑容图像;所述各个第二人脸样本图像在各笑容程度下的笑容图像构成所述样本笑容图像组。Optionally, the second processing module 630 is configured to shift the amplitude of each second human face sample image along the normal vector direction multiple times with different step sizes based on the smile gradient function, obtaining smile images of each second human face sample image at different smile degrees; these smile images at the various smile degrees constitute the sample smile image group.
可选的,第一处理模块620,用于根据各所述第一人脸样本图像的笑容程度值,对各所述第一人脸样本图像的隐变量特征进行排序,得到隐变量特征序列;根据隐变量特征序列,对所述各第一人脸样本图像进行分类,得到所述第一类隐变量特征样本和所述第二类隐变量特征样本。Optionally, the first processing module 620 is configured to sort the latent variable features of the first human face sample images according to the smile degree value of each first human face sample image, obtaining a latent variable feature sequence, and to classify the first human face sample images according to the latent variable feature sequence, obtaining the first-class and second-class latent variable feature samples.
可选的,所述第一人脸样本图像的隐变量特征是在利用预训练模型生成所述第一人脸样本图像时确定的。Optionally, the latent variable features of the first human face sample images are determined when the first human face sample images are generated by a pre-trained model.
可选的,各所述第一人脸样本图像的图像数据中的笑容程度值是利用训练后的笑容分类模型对所述各第一人脸样本图像中的人脸进行笑容分类而获得的。Optionally, the smile degree value in the image data of each first human face sample image is obtained by performing smile classification on the face in that image with a trained smile classification model.
可选的,所述第一处理模块620,还用于在获取第一人脸样本图像的图像数据之前,获取笑容分类样本图像,以及相应的笑容类型标注;利用所述笑容分类样本图像和所述笑容类型标注,对预先构建的笑容分类模型进行训练,得到所述训练后的笑容分类模型。Optionally, the first processing module 620 is further configured to, before acquiring the image data of the first human face sample images, acquire smile classification sample images and corresponding smile type annotations, and train a pre-built smile classification model with the smile classification sample images and smile type annotations to obtain the trained smile classification model.
可选的,所述笑容图像生成模型包括生成对抗网络,所述生成对抗网络包括嘴部判别器;相应地,第三处理模块640,用于通过所述嘴部判别器对笑容图像生成模型所输出的笑容图像进行基于嘴部区域的监督处理。Optionally, the smile image generation model includes a generative adversarial network, and the generative adversarial network includes a mouth discriminator; correspondingly, the third processing module 640 is configured to perform mouth-region-based supervision, via the mouth discriminator, on the smile images output by the smile image generation model.
本公开实施例提供的笑容图像生成模型的训练装置,获取第一人脸样本图像;对第一人脸样本图像进行第一处理,得到笑容渐变函数;其中,笑容渐变函数用于表示人脸样本图像中的笑容程度;利用笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组;样本笑容图像组中包括每个第二人脸样本图像在不同笑容程度下的笑容图像;利用样本笑容图像组对预先设置的笑容图像生成模型进行训练,得到训练后的笑容图像生成模型。利用该训练后的笑容图像生成模型,能够对目标人脸图像进行处理,以生成与期望笑容程度一致的笑容图像。通过这样的方式能够获得人脸的在期望笑容程度下的笑容图像,为用户带来更多的体验。The training device for a smile image generation model provided by embodiments of the present disclosure acquires first human face sample images; performs a first processing on them to obtain a smile gradient function representing the degree of smile in a human face sample image; performs, using the smile gradient function, a second processing on multiple second human face sample images to obtain a sample smile image group including smile images of each second human face sample image at different smile degrees; and trains a preset smile image generation model with the sample smile image group, yielding the trained smile image generation model. With the trained model, a target face image can be processed to generate a smile image matching a desired smile degree. In this way, smile images of a face at the desired smile degree can be obtained, offering users a richer experience.
对应于上文实施例的笑容图像的生成方法,图7为本公开实施例提供的笑容图像的生成装置的结构框图。为了便于说明,仅示出了与本公开实施例相关的部分。参照图7,所述笑容图像的生成装置包括:图像获取模块710、图像生成模块720。Corresponding to the method for generating a smile image in the above embodiment, FIG. 7 is a structural block diagram of a device for generating a smile image provided by an embodiment of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown. Referring to FIG. 7, the smile image generating device includes: an image acquisition module 710 and an image generation module 720.
图像获取模块710,用于获取目标人脸图像;An image acquisition module 710, configured to acquire a target face image;
图像生成模块720,用于将所述目标人脸图像和期望笑容程度输入至训练后的笑容图像生成模型,以使所述训练后的笑容图像生成模型输出所述目标人脸图像在所述期望笑容程度下的笑容图像;The image generation module 720 is configured to input the target face image and a desired smile degree into the trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target face image at the desired smile degree;
其中,所述笑容图像生成模型通过如上述训练方法训练得到。Wherein, the smile image generation model is trained by the above-mentioned training method.
本公开实施例提供的笑容图像的生成装置,获取第一人脸样本图像;对第一人脸样本图像进行第一处理,得到笑容渐变函数;其中,笑容渐变函数用于表示人脸样本图像中的笑容程度;利用笑容渐变函数,对多个第二人脸样本图像进行第二处理,得到样本笑容图像组;样本笑容图像组中包括每个第二人脸样本图像在不同笑容程度下的笑容图像;利用样本笑容图像组对预先设置的笑容图像生成模型进行训练,得到训练后的笑容图像生成模型。利用该训练后的笑容图像生成模型,能够对目标人脸图像进行处理,以生成与期望笑容程度一致的笑容图像。通过这样的方式能够获得人脸的在期望笑容程度下的笑容图像,为用户带来更多的体验。The device for generating a smile image provided by embodiments of the present disclosure acquires first human face sample images; performs a first processing on them to obtain a smile gradient function representing the degree of smile in a human face sample image; performs, using the smile gradient function, a second processing on multiple second human face sample images to obtain a sample smile image group including smile images of each second human face sample image at different smile degrees; and trains a preset smile image generation model with the sample smile image group, yielding the trained smile image generation model. With the trained model, a target face image can be processed to generate a smile image matching a desired smile degree. In this way, smile images of a face at the desired smile degree can be obtained, offering users a richer experience.
本公开实施例提供的电子设备,可用于执行上述方法实施例的技术方案,其实现原理和技术效果类似,本公开实施例此处不再赘述。The electronic device provided by the embodiments of the present disclosure can be used to implement the technical solutions of the above method embodiments, and its implementation principles and technical effects are similar, so the embodiments of the present disclosure will not repeat them here.
参考图8,其示出了适于用来实现本公开实施例的电子设备900的结构示意图,该电子设备900可以为终端设备或媒体库。其中,终端设备可以包括但不限于诸如移动电话、笔记本电脑、数字广播接收器、个人数字助理(Personal Digital Assistant,简称PDA)、平板电脑(Portable Android Device,简称PAD)、便携式多媒体播放器(Portable Media Player,简称PMP)、车载终端(例如车载导航终端)、可穿戴电子设备等等的移动终端以及诸如数字TV、台式计算机、智能家居设备等等的固定终端。图8示出的电子设备仅仅是一个实施例,不应对本公开实施例的功能和使用范围带来任何限制。Referring to FIG. 8 , it shows a schematic structural diagram of an electronic device 900 suitable for implementing the embodiments of the present disclosure. The electronic device 900 may be a terminal device or a media library. Among them, the terminal equipment may include but not limited to mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (Personal Digital Assistant, PDA for short), tablet computers (Portable Android Device, PAD for short), portable multimedia players (Portable Media Player, referred to as PMP), vehicle-mounted terminals (such as vehicle-mounted navigation terminals), mobile terminals such as wearable electronic devices, and fixed terminals such as digital TVs, desktop computers, and smart home devices. The electronic device shown in FIG. 8 is only an embodiment, and should not limit the functions and application scope of the embodiments of the present disclosure.
如图8所示,电子设备900可以包括用于执行上述各方法的处理器901(例如中央处理器、图形处理器等),其可以根据存储在只读存储器(Read Only Memory,简称ROM)902中的程序或者从存储装置908加载到随机访问存储器(Random Access Memory,简称RAM)903中的程序而执行各种适当的动作和处理。在RAM 903中,还存储有电子设备900操作所需的各种程序和数据。处理器901、ROM 902以及RAM 903通过总线904彼此相连。输入/输出(I/O)接口905也连接至总线904。As shown in FIG. 8, the electronic device 900 may include a processor 901 (e.g., a central processing unit, a graphics processing unit, etc.) for performing the above methods, which may execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 902 or a program loaded from a storage device 908 into a random access memory (RAM) 903. The RAM 903 also stores various programs and data required for the operation of the electronic device 900. The processor 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
通常,以下装置可以连接至I/O接口905:包括例如触摸屏、触摸板、键盘、鼠标、摄像头、麦克风、加速度计、陀螺仪等的输入装置906;包括例如液晶屏幕(Liquid Crystal Display,简称LCD)、扬声器、振动器等的输出装置907;包括例如磁带、硬盘等的存储装置908;以及通信装置909。通信装置909可以允许电子设备900与其他设备进行无线或有线通信以交换数据。虽然图8示出了具有各种装置的电子设备900,但是应理解的是,并不要求实施或具备所有示出的装置。可以替代地实施或具备更多或更少的装置。Generally, the following devices may be connected to the I/O interface 905: input devices 906 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer, a gyroscope, etc.; output devices 907 including, for example, a liquid crystal display (LCD), a speaker, a vibrator, etc.; storage devices 908 including, for example, a magnetic tape, a hard disk, etc.; and a communication device 909. The communication device 909 may allow the electronic device 900 to perform wireless or wired communication with other devices to exchange data. Although FIG. 8 shows the electronic device 900 with various devices, it should be understood that implementing or providing all of the devices shown is not required; more or fewer devices may alternatively be implemented or provided.
特别地,根据本公开的实施例,上文参考流程图描述的过程可以被实现为计算机软件程序。例如,本公开的实施例包括一种计算机程序产品,其包括承载在计算机可读介质上的计算机程序,该计算机程序包含用于执行根据本公开实施例所述的各流程图所示的方法的程序代码。在这样的实施例中,该计算机程序可以通过通信装置909从网络上被下载和安装,或者从存储装置908被安装,或者从ROM 902被安装。在该计算机程序被处理器901执行时,执行本公开实施例的方法中限定的上述功能。In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts can be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program includes instructions for executing the methods shown in the flow charts described in the embodiments of the present disclosure. code. In such an embodiment, the computer program may be downloaded and installed from a network via communication means 909, or from storage means 908, or from ROM 902. When the computer program is executed by the processor 901, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
需要说明的是,本公开上述的计算机可读介质可以是计算机可读信号介质或者计算机可读存储介质或者是上述两者的任意组合。计算机可读存储介质例如可以是——但不限于——电、磁、光、电磁、红外线、或半导体的***、装置或器件,或者任意以上的组合。计算机可读存储介质的更具体的例子可以包括但不限于:具有一个或多个导线的电连接、便携式计算机磁盘、硬盘、随机访问存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(Electrically Programmable Read Only Memory,简称EPROM)、闪存、光纤、便携式紧凑磁盘只读存储器(Compact Disc-Read Only Memory,简称CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。在本公开中,计算机可读存储介质可以是任何包含或存储程序的有形介质,该程序可以被指令执行***、装置或者器件使用或者与其结合使用。而在本公开中,计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号,其中承载了计算机可读的程序代码。这种传播的数据信号可以采用多种形式,包括但不限于电磁信号、光信号或上述的任意合适的组合。计算机可读信号介质还可以是计算机可读存储介质以外的任何计算机可读介质,该计算机可读信号介质可以发送、传播或者传输用于由指令执行***、装置或者器件使用或者与其结合使用的程序。计算机可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于:电线、光缆、射频(Radio Frequency,简称RF)等等,或者上述的任意合适的组合。It should be noted that the above-mentioned computer-readable medium in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the above two. A computer readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable Electrically Programmable Read Only Memory (EPROM for short), flash memory, optical fiber, portable compact disk read-only memory (Compact Disc-Read Only Memory, CD-ROM for short), optical storage device, magnetic storage device, or the above any suitable combination. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. 
In the present disclosure, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can transmit, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device . The program code contained on the computer readable medium can be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (Radio Frequency, RF for short), etc., or any suitable combination of the above.
上述计算机可读介质可以是上述电子设备中所包含的;也可以是单独存在,而未装配入该电子设备中。The above-mentioned computer-readable medium may be included in the above-mentioned electronic device, or may exist independently without being incorporated into the electronic device.
上述计算机可读介质承载有一个或者多个程序,当上述一个或者多个程序被该电子设备执行时,使得该电子设备执行上述实施例所示的方法。The above-mentioned computer-readable medium carries one or more programs, and when the above-mentioned one or more programs are executed by the electronic device, the electronic device is made to execute the methods shown in the above-mentioned embodiments.
可以以一种或多种程序设计语言或其组合来编写用于执行本公开的操作的计算机程序代码,上述程序设计语言包括面向对象的程序设计语言—诸如Java、Smalltalk、C++,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或媒体库上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络——包括局域网(Local Area Network,简称LAN)或广域网(Wide Area Network,简称WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。Computer program code for carrying out the operations of the present disclosure can be written in one or more programming languages, or combinations thereof, including object-oriented programming languages—such as Java, Smalltalk, C++, and conventional Procedural Programming Language - such as "C" or a similar programming language. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or media library. In cases involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or it can be connected to an external A computer (connected via the Internet, eg, using an Internet service provider).
附图中的流程图和框图,图示了按照本公开各种实施例的***、方法和计算机程序产品的可能实现的体系架构、功能和操作。在这点上,流程图或框图中的每个方框可以代表一个模块、程序段、或代码的一部分,该模块、程序段、或代码的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。也应当注意,在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个接连地表示的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或操作的专用的基于硬件的***来实现,或者可以用专用硬件与计算机指令的组合来实现。The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more logical functions for implementing specified executable instructions. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by a dedicated hardware-based system that performs the specified functions or operations , or may be implemented by a combination of dedicated hardware and computer instructions.
描述于本公开实施例中所涉及到的单元可以通过软件的方式实现,也可以通过硬件的方式来实现。其中,单元的名称在某种情况下并不构成对该单元本身的限定,例如,第一获取单元还可以被描述为“获取至少两个网际协议地址的单元”。The units involved in the embodiments described in the present disclosure may be implemented by software or by hardware. Wherein, the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the first obtaining unit may also be described as "a unit for obtaining at least two Internet Protocol addresses".
本文中以上描述的功能可以至少部分地由一个或多个硬件逻辑部件来执行。例如,非限制性地,可以使用的示范类型的硬件逻辑部件包括:现场可编程门阵列(Field Programmable Gate Array, 简称FPGA)、专用集成电路(Application Specific Integrated Circuit,简称ASIC)、专用标准产品(Application Specific Standard Product,简称ASSP)、片上***(System On Chip,简称SOC)、复杂可编程逻辑设备(Complex Programmable Logic Device,简称CPLD)等等。The functions described herein above may be performed at least in part by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: Field Programmable Gate Array (Field Programmable Gate Array, FPGA for short), Application Specific Integrated Circuit (ASIC for short), Application Specific Standard Products ( Application Specific Standard Product (ASSP for short), System On Chip (SOC for short), Complex Programmable Logic Device (CPLD for short), etc.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in connection with, an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The following are some embodiments of the present disclosure.
In a first aspect, according to one or more embodiments of the present disclosure, a method for training a smile image generation model includes:
acquiring a first human face sample image;
performing first processing on the first human face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a human face sample image;
performing second processing on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each of the second human face sample images at different smile degrees; and
training a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
Optionally, there are a plurality of first human face sample images, and the image data of each first human face sample image includes a smile degree value and a latent variable feature.
The performing first processing on the first human face sample image to obtain a smile gradient function includes:
classifying, according to the smile degree value of each first human face sample image, the latent variable features corresponding to the first human face sample images to obtain first-class latent variable feature samples and second-class latent variable feature samples;
training a linear classifier model according to the first-class latent variable feature samples and the second-class latent variable feature samples;
obtaining a hyperplane output by the trained linear classifier model, and obtaining the direction of the normal vector of the hyperplane; and
obtaining the smile gradient function according to the normal vector.
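The first-processing steps above (classify latent features by smile degree, fit a linear classifier, take the hyperplane's unit normal as the smile direction) can be sketched as follows. This is an illustrative reconstruction, not the patent's implementation: the logistic-regression trainer, the latent dimensionality, and the synthetic data are all assumptions, since the text does not fix the linear classifier's form.

```python
import numpy as np

def train_linear_classifier(Z, y, lr=0.1, epochs=200):
    # Z: (N, D) latent variable features; y: (N,) binary labels
    # (1 = high smile degree, 0 = low).  Plain logistic regression
    # stands in for the unspecified "linear classifier model".
    n, d = Z.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))  # predicted smile probability
        w -= lr * (Z.T @ (p - y)) / n           # log-loss gradient step
        b -= lr * float(np.mean(p - y))
    return w, b

def smile_direction(w):
    # Unit normal vector of the separating hyperplane w.z + b = 0.
    return w / np.linalg.norm(w)

def smile_gradient(z, alpha, n_vec):
    # "Smile gradient function": shifting latent z by alpha along the
    # hyperplane normal changes the smile degree of the decoded face.
    return z + alpha * n_vec
```

Moving a latent in the +n_vec direction raises the classifier's smile score, and moving in -n_vec lowers it, which is what makes the normal usable as a continuous smile-degree control.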
Optionally, the performing second processing on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group includes:
moving, based on the smile gradient function, the amplitude of each second human face sample image in the direction of the normal vector multiple times with different step lengths to obtain smile images of each second human face sample image at different smile degrees,
where the smile images of the second human face sample images at the respective smile degrees constitute the sample smile image group.
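The second-processing step above can be sketched as a loop that shifts each sample's latent along the hyperplane normal by several step lengths and decodes each shifted latent into an image. The `decode` callable would be a pre-trained generator (e.g. a GAN synthesis network); the identity default and the particular step lengths are placeholders chosen here so the sketch stays self-contained, not values from the text.

```python
import numpy as np

def build_smile_group(latents, n_vec, steps=(-2.0, -1.0, 0.0, 1.0, 2.0),
                      decode=None):
    # For each second face sample, move its amplitude along the hyperplane
    # normal with several different step lengths, then decode every shifted
    # latent into a smile image.
    decode = decode or (lambda z: z)  # placeholder for a real generator
    # group[i][j] is the smile image of sample i at smile level j.
    return [[decode(z + s * n_vec) for s in steps] for z in latents]
```

Each inner list is one sample's smile images across all smile degrees; together they form the sample smile image group used to train the generation model.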
Optionally, the classifying, according to the smile degree value of each first human face sample image, the latent variable features corresponding to the first human face sample images to obtain first-class latent variable feature samples and second-class latent variable feature samples includes:
sorting the latent variable features of the first human face sample images according to the smile degree values of the first human face sample images to obtain a latent variable feature sequence; and
classifying the first human face sample images according to the latent variable feature sequence to obtain the first-class latent variable feature samples and the second-class latent variable feature samples.
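The sort-then-split step above can be sketched as follows. The fraction kept per class is an assumed parameter; the text only says the sorted sequence is divided into two classes, so keeping the two extremes of the ranking is one plausible reading.

```python
def split_by_smile_degree(latent_features, smile_degrees, frac=0.25):
    # Sort latent variable features by their smile degree value, then keep
    # the two extremes of the sorted sequence as the two sample classes.
    # `frac` (fraction kept per class) is an assumption, not from the text.
    order = sorted(range(len(smile_degrees)), key=lambda i: smile_degrees[i])
    k = max(1, int(len(order) * frac))
    low = [latent_features[i] for i in order[:k]]    # least-smiling class
    high = [latent_features[i] for i in order[-k:]]  # most-smiling class
    return low, high
```

Discarding the ambiguous middle of the ranking gives the linear classifier cleanly separated positive and negative latent samples.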
Optionally, the latent variable feature of the first human face sample image is determined when the first human face sample image is generated by using a pre-trained model.
Optionally, the smile degree value in the image data of each first human face sample image is obtained by performing smile classification on the human face in each first human face sample image by using a trained smile classification model.
Optionally, before the acquiring the image data of the first human face sample image, the method further includes:
acquiring smile classification sample images and corresponding smile type annotations; and
training a pre-built smile classification model by using the smile classification sample images and the smile type annotations to obtain the trained smile classification model.
Optionally, the smile image generation model includes a generative adversarial network, and the generative adversarial network includes a mouth discriminator.
The method includes:
performing, by the mouth discriminator, supervision processing based on the mouth region on the smile image output by the smile image generation model.
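The mouth-region supervision above can be sketched as cropping a mouth patch from the generated face and scoring it with a dedicated discriminator. Everything specific here is assumed: the crop coordinates (stated for a hypothetical 160x160 aligned face) and the hinge adversarial loss are common GAN choices, while the text only says the mouth region is supervised by its own discriminator.

```python
import numpy as np

def crop_mouth(image, box=(96, 160, 48, 112)):
    # Crop the mouth region from an aligned face image of shape (H, W, C).
    # The (top, bottom, left, right) box is assumed, not from the text.
    top, bottom, left, right = box
    return image[top:bottom, left:right]

def mouth_discriminator_loss(d_real, d_fake):
    # Hinge adversarial loss on the mouth discriminator's scores for real
    # and generated mouth crops -- one common GAN formulation.
    return float(np.mean(np.maximum(0.0, 1.0 - d_real)) +
                 np.mean(np.maximum(0.0, 1.0 + d_fake)))
```

Adding this term to the full-image adversarial loss concentrates extra training signal on the region that changes most when a smile is added.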
In a second aspect, according to one or more embodiments of the present disclosure, a method for generating a smile image includes:
acquiring a target human face image; and
inputting the target human face image and a desired smile degree into a trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target human face image at the desired smile degree,
where the smile image generation model is trained by the training method according to any one of the implementations of the first aspect.
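At inference time the second aspect reduces to a single call: pass the target face image and a desired smile degree to the trained model. A minimal wrapper might look like this; `model` is a hypothetical callable, and clamping the degree to [0, 1] is an assumed convention, since the text does not fix the degree's range.

```python
def generate_smile(model, face_image, desired_degree):
    # Feed the target face image and a desired smile degree to the
    # trained smile image generation model.  The [0, 1] clamp is an
    # assumption; the text does not specify the degree's valid range.
    degree = min(1.0, max(0.0, float(desired_degree)))
    return model(face_image, degree)
```

Because the model was trained on a group of images spanning many smile degrees per identity, the degree argument acts as a continuous control rather than a binary smile/no-smile switch.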
In a third aspect, according to one or more embodiments of the present disclosure, an apparatus for training a smile image generation model includes:
an image acquisition module, configured to acquire a first human face sample image;
a first processing module, configured to perform first processing on the first human face sample image to obtain a smile gradient function, where the smile gradient function is used to represent the degree of smile in a human face sample image;
a second processing module, configured to perform second processing on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group, where the sample smile image group includes smile images of each of the second human face sample images at different smile degrees; and
a third processing module, configured to train a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
Optionally, there are a plurality of first human face sample images, and the image data of each first human face sample image includes a smile degree value and a latent variable feature.
The first processing module is specifically configured to: classify, according to the smile degree value of each first human face sample image, the latent variable features corresponding to the first human face sample images to obtain first-class latent variable feature samples and second-class latent variable feature samples; train a linear classifier model according to the first-class latent variable feature samples and the second-class latent variable feature samples; obtain a hyperplane output by the trained linear classifier model and obtain the direction of the normal vector of the hyperplane; and obtain the smile gradient function according to the normal vector.
Optionally, the second processing module is specifically configured to: move, based on the smile gradient function, the amplitude of each second human face sample image in the direction of the normal vector multiple times with different step lengths to obtain smile images of each second human face sample image at different smile degrees, where the smile images of the second human face sample images at the respective smile degrees constitute the sample smile image group.
Optionally, the first processing module is specifically configured to: sort the latent variable features of the first human face sample images according to the smile degree values of the first human face sample images to obtain a latent variable feature sequence; and classify the first human face sample images according to the latent variable feature sequence to obtain the first-class latent variable feature samples and the second-class latent variable feature samples.
Optionally, the latent variable feature of the first human face sample image is determined when the first human face sample image is generated by using a pre-trained model.
Optionally, the smile degree value in the image data of each first human face sample image is obtained by performing smile classification on the human face in each first human face sample image by using a trained smile classification model.
Optionally, the first processing module is further configured to: before the image data of the first human face sample image is acquired, acquire smile classification sample images and corresponding smile type annotations; and train a pre-built smile classification model by using the smile classification sample images and the smile type annotations to obtain the trained smile classification model.
Optionally, the smile image generation model includes a generative adversarial network, and the generative adversarial network includes a mouth discriminator.
The third processing module is specifically configured to perform, by the mouth discriminator, supervision processing based on the mouth region on the smile image output by the smile image generation model.
In a fourth aspect, an embodiment of the present disclosure provides an apparatus for generating a smile image, including:
an image acquisition module, configured to acquire a target human face image; and
an image generation module, configured to input the target human face image and a desired smile degree into a trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target human face image at the desired smile degree,
where the smile image generation model is trained by the training method described above.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including at least one processor and a memory,
where the memory stores computer-executable instructions, and
the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the method for training a smile image generation model according to the first aspect and the various possible designs of the first aspect, and/or the method for generating a smile image according to the second aspect and the various possible designs of the second aspect.
第六方面,本公开实施例提供一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机执行指令,当处理器执行所述计算机执行指令时,实现如上第一方面以及第一方面各种可能的涉及所述的笑容图像生成模型的训练方法,和/或,第二方面以及第二方面各种可能的涉及所述的笑容图像的生成方法。In the sixth aspect, the embodiments of the present disclosure provide a computer-readable storage medium, the computer-readable storage medium stores computer-executable instructions, and when the processor executes the computer-executable instructions, the above first aspect and the first The various possible training methods of the smile image generation model in the aspect, and/or, the second aspect and the various possible methods of the smile image generation in the second aspect.
第七方面,本公开实施例提供一种计算机程序产品,包括计算机指令,该计算机指令被处理器执行时实现如上第一方面以及第一方面各种可能的涉及所述的笑容图像生成模型的训练方法,和/或,第二方面以及第二方面各种可能的涉及所述的笑容图像的生成方法。In the seventh aspect, the embodiments of the present disclosure provide a computer program product, including computer instructions, which, when executed by a processor, implement the above first aspect and various possible trainings related to the smile image generation model in the first aspect. The method, and/or, the second aspect and various possible methods of generating the smile image related to the second aspect.
第八方面,本公开实施例提供一种计算机程序,该计算机程序被处理器执行时如上第一方面以及第一方面各种可能的涉及所述的笑容图像生成模型的训练方法,和/或,第二方面以及第二方面各种可能的涉及所述的笑容图像的生成方法。In an eighth aspect, an embodiment of the present disclosure provides a computer program. When the computer program is executed by a processor, the above first aspect and various possible training methods related to the smile image generation model in the first aspect, and/or, The second aspect and various possibilities of the second aspect relate to the method for generating the smile image.
The above description is merely an illustration of the preferred embodiments of the present disclosure and of the technical principles applied. Those skilled in the art should understand that the scope of the disclosure involved herein is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the disclosed concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present disclosure.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although the above discussion contains several specific implementation details, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (15)

  1. A method for training a smile image generation model, characterized by comprising:
    acquiring a first human face sample image;
    performing first processing on the first human face sample image to obtain a smile gradient function, wherein the smile gradient function is used to represent the degree of smile in a human face sample image;
    performing second processing on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group, wherein the sample smile image group includes smile images of each of the second human face sample images at different smile degrees; and
    training a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
  2. The method for training a smile image generation model according to claim 1, wherein there are a plurality of first human face sample images, and the image data of each first human face sample image includes a smile degree value and a latent variable feature;
    the performing first processing on the first human face sample image to obtain a smile gradient function comprises:
    classifying, according to the smile degree value of each first human face sample image, the latent variable features corresponding to the first human face sample images to obtain first-class latent variable feature samples and second-class latent variable feature samples;
    training a linear classifier model according to the first-class latent variable feature samples and the second-class latent variable feature samples;
    obtaining a hyperplane output by the trained linear classifier model, and obtaining the direction of the normal vector of the hyperplane; and
    obtaining the smile gradient function according to the normal vector.
  3. The method for training a smile image generation model according to claim 2, wherein the performing second processing on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group comprises:
    moving, based on the smile gradient function, the amplitude of each second human face sample image in the direction of the normal vector multiple times with different step lengths to obtain smile images of each second human face sample image at different smile degrees,
    wherein the smile images of the second human face sample images at the respective smile degrees constitute the sample smile image group.
  4. The method for training a smile image generation model according to claim 2, wherein
    the classifying, according to the smile degree value of each first human face sample image, the latent variable features corresponding to the first human face sample images to obtain first-class latent variable feature samples and second-class latent variable feature samples comprises:
    sorting the latent variable features of the first human face sample images according to the smile degree values of the first human face sample images to obtain a latent variable feature sequence; and
    classifying the first human face sample images according to the latent variable feature sequence to obtain the first-class latent variable feature samples and the second-class latent variable feature samples.
  5. The method for training a smile image generation model according to claim 2 or 4, wherein the latent variable feature of the first human face sample image is determined when the first human face sample image is generated by using a pre-trained model.
  6. The method for training a smile image generation model according to claim 2 or 4, wherein the smile degree value in the image data of each first human face sample image is obtained by performing smile classification on the human face in each first human face sample image by using a trained smile classification model.
  7. The method for training a smile image generation model according to claim 6, wherein before the acquiring the image data of the first human face sample image, the method further comprises:
    acquiring smile classification sample images and corresponding smile type annotations; and
    training a pre-built smile classification model by using the smile classification sample images and the smile type annotations to obtain the trained smile classification model.
  8. The method for training a smile image generation model according to any one of claims 1-7, wherein the smile image generation model comprises a generative adversarial network, and the generative adversarial network comprises a mouth discriminator;
    the method comprises:
    performing, by the mouth discriminator, supervision processing based on the mouth region on the smile image output by the smile image generation model.
  9. A method for generating a smile image, characterized by comprising:
    acquiring a target human face image; and
    inputting the target human face image and a desired smile degree into a trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target human face image at the desired smile degree,
    wherein the smile image generation model is trained by the training method according to any one of claims 1-8.
  10. An apparatus for training a smile image generation model, characterized by comprising:
    an image acquisition module, configured to acquire a first human face sample image;
    a first processing module, configured to perform first processing on the first human face sample image to obtain a smile gradient function, wherein the smile gradient function is used to represent the degree of smile in a human face sample image;
    a second processing module, configured to perform second processing on a plurality of second human face sample images by using the smile gradient function to obtain a sample smile image group, wherein the sample smile image group includes smile images of each of the second human face sample images at different smile degrees; and
    a third processing module, configured to train a preset smile image generation model by using the sample smile image group to obtain a trained smile image generation model.
  11. An apparatus for generating a smile image, characterized by comprising:
    an image acquisition module, configured to acquire a target human face image; and
    an image generation module, configured to input the target human face image and a desired smile degree into a trained smile image generation model, so that the trained smile image generation model outputs a smile image of the target human face image at the desired smile degree,
    wherein the smile image generation model is trained by the training method according to any one of claims 1-8.
  12. An electronic device, comprising:
    at least one processor; and
    a memory,
    wherein the memory stores computer-executable instructions; and
    the at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the method for training a smile image generation model according to any one of claims 1-8, and/or the method for generating a smile image according to claim 9.
  13. A computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions that, when executed by a processor, implement the method for training a smile image generation model according to any one of claims 1-8, and/or the method for generating a smile image according to claim 9.
  14. A computer program product, comprising computer instructions, wherein the computer instructions, when executed by a processor, implement the method for training a smile image generation model according to any one of claims 1-8, and/or the method for generating a smile image according to claim 9.
  15. A computer program, wherein the computer program, when executed by a processor, implements the method for training a smile image generation model according to any one of claims 1-8, and/or the method for generating a smile image according to claim 9.
PCT/CN2022/094789 2021-07-28 2022-05-24 Method for training smile image generation model and method for generating smile image WO2023005385A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110858192.XA CN115689864A (en) 2021-07-28 2021-07-28 Method for training smile image generation model and method for generating smile image
CN202110858192.X 2021-07-28

Publications (1)

Publication Number Publication Date
WO2023005385A1 true WO2023005385A1 (en) 2023-02-02

Family

ID=85059385

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/094789 WO2023005385A1 (en) 2021-07-28 2022-05-24 Method for training smile image generation model and method for generating smile image

Country Status (2)

Country Link
CN (1) CN115689864A (en)
WO (1) WO2023005385A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012185736A (en) * 2011-03-07 2012-09-27 Japan Research Institute Ltd Smile banking system
US20140342304A1 (en) * 2013-03-15 2014-11-20 Demetrios S. Meletiou, JR. Dental method of smile design
CN105608412A (en) * 2015-10-16 2016-05-25 厦门美图之家科技有限公司 Smiling face image processing method based on image deformation, system and shooting terminal thereof
US20210201003A1 (en) * 2019-12-30 2021-07-01 Affectiva, Inc. Synthetic data for neural network training using vectors


Also Published As

Publication number Publication date
CN115689864A (en) 2023-02-03

Similar Documents

Publication Publication Date Title
CN109214343B (en) Method and device for generating face key point detection model
WO2020155907A1 (en) Method and apparatus for generating cartoon style conversion model
CN110827378B (en) Virtual image generation method, device, terminal and storage medium
CN111368685B (en) Method and device for identifying key points, readable medium and electronic equipment
CN111476871B (en) Method and device for generating video
US11494612B2 (en) Systems and methods for domain adaptation in neural networks using domain classifier
US11830505B2 (en) Identification of fake audio content
US11640519B2 (en) Systems and methods for domain adaptation in neural networks using cross-domain batch normalization
JP2022531220A (en) Video tagging by correlating visual features to sound tags
WO2020024484A1 (en) Method and device for outputting data
CN109670444B (en) Attitude detection model generation method, attitude detection device, attitude detection equipment and attitude detection medium
WO2022161357A1 (en) Data augmentation-based training sample acquisition method and apparatus, and electronic device
US20210074260A1 (en) Generation of Speech with a Prosodic Characteristic
WO2023040697A1 (en) Information processing method and apparatus, device, readable storage medium and product
CN113555032B (en) Multi-speaker scene recognition and network training method and device
WO2023185391A1 (en) Interactive segmentation model training method, labeling data generation method, and device
CN110087143A (en) Method for processing video frequency and device, electronic equipment and computer readable storage medium
US11457033B2 (en) Rapid model retraining for a new attack vector
CN110262886A (en) Task executing method and device, electronic equipment and storage medium
CN114140814A (en) Emotion recognition capability training method and device and electronic equipment
CN113033707B (en) Video classification method and device, readable medium and electronic equipment
CN113033682B (en) Video classification method, device, readable medium and electronic equipment
WO2024061311A1 (en) Model training method and apparatus, and image classification method and apparatus
WO2023005385A1 (en) Method for training smile image generation model and method for generating smile image
WO2022262473A1 (en) Image processing method and apparatus, and device and storage medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22847995

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE