CN113989908A - Method, device, electronic equipment and storage medium for identifying face image

Info

Publication number: CN113989908A
Application number: CN202111437020.1A
Authority: CN
Inventors: 张国生
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2021-11-29
Filing date: 2021-11-29
Publication date: 2022-01-28

Abstract

The present disclosure provides a method, an apparatus, an electronic device, and a storage medium for identifying a face image. It relates to the technical field of artificial intelligence, specifically to deep learning and computer vision, and can be applied to scenarios such as face recognition and face image processing. The specific implementation is as follows: acquiring a face image to be identified; and analyzing the face image to be identified by using a target neural network model to determine whether the face image to be identified is an AI-synthesized image; wherein the target neural network model is obtained through machine learning training using multiple groups of data, and each group of data includes a training image, a first cue map, and a second cue map, the first cue map being an attack cue map corresponding to the training image and the second cue map being a predicted cue map corresponding to the training image.

Description

Method, device, electronic equipment and storage medium for identifying face image

Technical Field

The present disclosure relates to the field of artificial intelligence, in particular to deep learning and computer vision, and can be applied to scenarios such as face recognition and face image processing. More specifically, it relates to a method and an apparatus for identifying a face image, an electronic device, and a storage medium.

Background

Face deepfake detection determines whether a face image has been synthesized or edited by artificial intelligence (AI). A face deepfake detection module is a basic component of a face recognition system and is used to ensure the security of that system.

In the related art, deepfake detection on face pictures is generally performed with algorithms based on deep learning. However, the detection accuracy of these algorithms is poor, which degrades their performance in practical applications.

Disclosure of Invention

The present disclosure provides a method, an apparatus, an electronic device, and a storage medium for identifying a face image, so as to at least solve the technical problem in the related art that deepfake detection on face images has low accuracy.

According to an aspect of the present disclosure, there is provided a method of identifying a face image, including: acquiring a face image to be identified; and analyzing the face image to be identified by using a target neural network model to determine whether the face image to be identified is an AI-synthesized image; wherein the target neural network model is obtained through machine learning training using multiple groups of data, and each group of data includes a training image, a first cue map, and a second cue map, the first cue map being an attack cue map corresponding to the training image and the second cue map being a predicted cue map corresponding to the training image.

According to another aspect of the present disclosure, there is provided an apparatus for identifying a face image, including: an acquisition module configured to acquire a face image to be identified; and an identification module configured to analyze the face image to be identified by using a target neural network model and determine whether the face image to be identified is an AI-synthesized image; wherein the target neural network model is obtained through machine learning training using multiple groups of data, and each group of data includes a training image, a first cue map, and a second cue map, the first cue map being an attack cue map corresponding to the training image and the second cue map being a predicted cue map corresponding to the training image.

According to still another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to perform the method for authenticating a face image as set forth in the present disclosure.

According to still another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of discriminating a face image proposed by the present disclosure.

According to yet another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, performs the method of authenticating face images as set forth in the present disclosure.

According to the present disclosure, a face image to be identified is acquired and analyzed by a target neural network model to determine whether it is an AI-synthesized image. Using the target neural network model in this way allows AI-synthesized images to be determined accurately, improves the accuracy of deepfake detection on face images, and thereby solves the technical problem in the related art that such detection has low accuracy.

It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.

Drawings

The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:

fig. 1 is a block diagram of a hardware structure of a computer terminal (or mobile device) for implementing a method of discriminating a face image according to an embodiment of the present disclosure;

fig. 2 is a flowchart of a method for discriminating a face image according to a first embodiment of the present disclosure;

fig. 3 is a flowchart of a method for discriminating a face image according to a second embodiment of the present disclosure;

fig. 4 is a flowchart of a method for discriminating a face image according to a third embodiment of the present disclosure;

fig. 5 is a flowchart of a method for discriminating a face image according to a fourth embodiment of the present disclosure;

fig. 6 is a flowchart of a method for discriminating a face image according to a fifth embodiment of the present disclosure;

fig. 7 is a flowchart of a method for discriminating a face image according to a sixth embodiment of the present disclosure;

fig. 8 is a flowchart of a method for discriminating a face image according to a seventh embodiment of the present disclosure;

fig. 9 is a flowchart of a method for discriminating a face image according to an eighth embodiment of the present disclosure;

fig. 10 is a schematic diagram of a method for identifying a face image according to an embodiment of the present disclosure;

fig. 11 is a block diagram of an apparatus for discriminating a face image according to an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

In the related art, face deepfake detection methods based on deep learning mainly include methods based on traditional machine learning models and convolutional neural networks, and methods based on a Long Short-Term Memory (LSTM) network or a Recurrent Neural Network (RNN). The traditional machine learning models may include Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Image Quality Model (IQM), and Support Vector Machine (SVM). These methods aim to design a more efficient network or to extract more discriminative features for classification, and they improve the performance of the deepfake detection model by training on a large amount of synthetic image data.

Deepfake detection algorithms based on deep learning mainly treat face deepfake detection as a binary classification problem and rely on high-level semantic features, which makes it difficult for them to discover fine-grained attack features. Meanwhile, deepfake detection is in practice a fine-grained classification task on face images; when classification is learned only from the synthetic face image as a whole, it is difficult for the detection model to learn discriminative features from subtle face edits. Moreover, with the development of AI image synthesis technology, face images synthesized or edited by AI have become so realistic that they can hardly be told apart from real people. Therefore, such methods have limited performance in detecting face images forged by high-performance face synthesis techniques.

In accordance with an embodiment of the present disclosure, there is provided a method of authenticating a face image, it being noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than here.

The method embodiments provided by the present disclosure may be executed in a mobile terminal, a computer terminal, or a similar electronic device. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein. Fig. 1 shows a block diagram of a hardware configuration of a computer terminal (or mobile device) for implementing a method of identifying a face image.

As shown in fig. 1, the computer terminal 100 includes a computing unit 101 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 102 or a computer program loaded from a storage unit 108 into a Random Access Memory (RAM) 103. In the RAM 103, various programs and data necessary for the operation of the computer terminal 100 can also be stored. The computing unit 101, the ROM 102, and the RAM 103 are connected to each other via a bus 104. An input/output (I/O) interface 105 is also connected to the bus 104.

A number of components in the computer terminal 100 are connected to the I/O interface 105, including: an input unit 106 such as a keyboard, a mouse, and the like; an output unit 107 such as various types of displays, speakers, and the like; a storage unit 108, such as a magnetic disk, optical disk, or the like; and a communication unit 109 such as a network card, modem, wireless communication transceiver, etc. The communication unit 109 allows the computer terminal 100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.

Computing unit 101 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 101 performs the method of discriminating face images described herein. For example, in some embodiments, the method of authenticating a face image may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 108. In some embodiments, part or all of the computer program may be loaded and/or installed onto the computer terminal 100 via the ROM 102 and/or the communication unit 109. When the computer program is loaded into RAM 103 and executed by the computing unit 101, one or more steps of the method of discriminating between face images described herein may be performed. Alternatively, in other embodiments, the computing unit 101 may be configured to perform the method of authenticating a face image by any other suitable means (e.g., by means of firmware).

Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

It should be noted here that in some alternative embodiments, the electronic device shown in fig. 1 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should be noted that fig. 1 is only one specific example and is intended to illustrate the types of components that may be present in the electronic device described above.

In the above operating environment, the present disclosure provides a method for authenticating a face image as shown in fig. 2, which may be executed by a computer terminal or similar electronic device as shown in fig. 1. Fig. 2 is a flowchart of a method for identifying a face image according to a first embodiment of the present disclosure. As shown in fig. 2, the method may include the steps of:

Step S21, acquiring a face image to be identified;

Optionally, acquiring the face image to be identified includes: acquiring a real face image to be identified.

Optionally, acquiring the face image to be identified includes: acquiring a synthetic face image to be identified, where the synthetic face image to be identified is obtained by performing image synthesis processing on a first image and a second image, and the first image and the second image are different real face images.

It should be noted that the differences between the first image and the second image include, but are not limited to, age, sex, and the like.

Optionally, the image synthesis processing includes preprocessing the first image and the second image, and processing the preprocessed first image and second image with a synthetic image generation model to obtain the synthetic face image to be identified. The first image and the second image are RGB (three-primary-color) images. The first image may be used to provide an original base image, which may include a background, a hairstyle, a head pose, an expression, and the like; the second image may be used to provide identity attributes, for example, the facial features in the synthetic image.

Preprocessing the first image and the second image includes: detecting the first image and the second image with a face detection model to obtain the face positions in the first image and the second image. For the specific implementation of obtaining the face positions with the face detection model, reference may be made to face detection in the related art, which is not described in detail here. The synthetic image generation model includes, but is not limited to, a face-swapping model (FaceShifter), a face-swapping model based on a generative adversarial network (GAN), and a deep synthesis model (Deep Face).

Step S22, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an AI-synthesized image.

The target neural network model is obtained through machine learning training using multiple groups of data, and each group of data includes a training image, a first cue map, and a second cue map, where the first cue map is an attack cue map corresponding to the training image and the second cue map is a predicted cue map corresponding to the training image.

Specifically, a cue map is a texture map of the edits made to a face, and can be obtained by performing element-wise subtraction between an AI-synthesized image and the unedited original face image.

Through steps S21 to S22 of the present disclosure, a face image to be identified is acquired and analyzed by the target neural network model to determine whether it is an AI-synthesized image. Using the target neural network model in this way allows AI-synthesized images to be determined accurately, improves the accuracy of deepfake detection on face images, and thereby solves the technical problem in the related art that such detection has low accuracy.

Fig. 3 is a flowchart of a method for identifying a face image according to a second embodiment of the present disclosure. As shown in fig. 3, the method may include the steps of:

Step S31, acquiring a face image to be identified;

Step S32, analyzing the face image to be identified by using the target neural network model to obtain a third cue map;

The third cue map is a predicted cue map corresponding to the face image to be identified. The decoder of the network decodes the features into an image of the same size as the face image to be identified, and this image is defined as the predicted cue map. The predicted cue map is used to judge whether the face image to be identified contains attack cues: if the input image is a real face image, the output predicted cue map should be an all-zero image; if the input image is an AI-synthesized image, the output predicted cue map should be a non-all-zero image.

Step S33, performing a mean operation on the pixel values contained in the third cue map to obtain a calculation result;

The calculation result is an attack score, whose value can be used to determine the probability that the face image to be identified is an AI-synthesized image. Specifically, the larger the attack score, the higher the probability that the face image to be identified is an AI-synthesized image; the smaller the attack score, the lower that probability.

Step S34, determining whether the face image to be identified is an AI-synthesized image based on the calculation result.

Optionally, when the calculation result is greater than a preset threshold, the face image to be identified is determined to be an AI-synthesized image; when the calculation result is less than or equal to the preset threshold, the face image to be identified is determined to be a real face image.

Step S31 is implemented in the same way as step S21 in the above embodiment, and is not described again.

In the above optional embodiment, the target neural network model is used to analyze the face image to be identified to obtain the predicted cue map corresponding to the face image to be identified, a mean operation is then performed on the pixel values contained in the predicted cue map to obtain a calculation result, and finally whether the face image to be identified is an AI-synthesized image is determined based on the calculation result, so that the accuracy of identifying the face image to be identified can be effectively improved.
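Steps S33 and S34 can be illustrated with a minimal NumPy sketch. It assumes the predicted cue map is already available as an array; the function names and the threshold value of 0.05 are hypothetical and are not taken from the disclosure, which only speaks of "a preset threshold".

```python
import numpy as np


def attack_score(predicted_cue_map: np.ndarray) -> float:
    """Step S33: mean of all pixel values in the predicted cue map."""
    return float(np.mean(predicted_cue_map))


def is_synthetic(predicted_cue_map: np.ndarray, threshold: float = 0.05) -> bool:
    """Step S34: strictly greater than the threshold means AI-synthesized.

    The default threshold is a placeholder chosen for illustration only.
    """
    return attack_score(predicted_cue_map) > threshold


# A real face should yield an all-zero predicted cue map.
real_like = np.zeros((4, 4), dtype=np.float32)

# A synthesized face should yield a cue map with non-zero regions.
fake_like = np.zeros((4, 4), dtype=np.float32)
fake_like[1:3, 1:3] = 0.8

print(attack_score(real_like), is_synthetic(real_like))
print(attack_score(fake_like), is_synthetic(fake_like))
```

A score equal to the threshold is classified as a real face, matching the "less than or equal to" branch described above.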

Fig. 4 is a flowchart of a method for discriminating a face image according to a third embodiment of the present disclosure. As shown in fig. 4, the method may include the steps of:

Step S41, acquiring a face image to be identified;

Step S42, determining a training image and a first cue map based on a third image and a fourth image, where the third image and the fourth image are different real face images;

The training image is a face image obtained by performing image synthesis processing on the third image and the fourth image, and the first cue map is the attack cue map corresponding to the training image.

It should be noted that the differences between the third image and the fourth image include, but are not limited to, age, sex, and the like. The third image and the fourth image are RGB images. The third image may be used to provide an original base image, which may include a background, a hairstyle, a head pose, an expression, and the like; the fourth image may be used to provide identity attributes, for example, the facial features in the synthetic image.

Specifically, for the implementation of determining the training image and the first cue map based on the third image and the fourth image, reference may be made to the further description in the following embodiments, which is not repeated here.

Step S43, training an initial neural network model with the training image and the first cue map to obtain a target neural network model;

Specifically, for the implementation of training the initial neural network model with the training image and the first cue map to obtain the target neural network model, reference may be made to the further description in the following embodiments, which is not repeated here.

Step S44, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an AI-synthesized image.

Steps S41 and S44 are the same as steps S21 and S22 in the above embodiment, and are not repeated here.

In the above optional embodiment, the training image and the first cue map are determined based on the third image and the fourth image, and the initial neural network model is then trained with the training image and the first cue map, so that a high-performance target neural network model can be obtained and the efficiency of identifying the face image is further improved.

Fig. 5 is a flowchart of a method for discriminating a face image according to a fourth embodiment of the present disclosure. As shown in fig. 5, the method may include the steps of:

Step S51, acquiring a face image to be identified;

Step S52, performing image synthesis processing with the third image and the fourth image to obtain a training image and an original base image corresponding to the training image;

The original base image corresponding to the training image is whichever of the third image and the fourth image provides the background, hairstyle, head pose, and expression during the image synthesis processing.

It should be noted that, for the implementation of performing image synthesis processing with the third image and the fourth image, reference may be made to the image synthesis processing of the first image and the second image in the foregoing embodiment, which is not described in detail here.

Step S53, performing element-wise subtraction on the training image and the original base image to obtain a first cue map;

The training image is an image synthesized by AI from the third image and the fourth image, and the texture map of the edits made to the face, namely the first cue map, can be obtained by performing element-wise subtraction on the training image and the original base image.

Step S54, training an initial neural network model with the training image and the first cue map to obtain a target neural network model;

Step S55, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an AI-synthesized image.

Steps S51, S54, and S55 are the same as steps S41, S43, and S44 in the above embodiments, and are not repeated here.

In the above optional embodiment, image synthesis processing is performed with the third image and the fourth image to obtain a training image and the original base image corresponding to the training image, element-wise subtraction is then performed on the training image and the original base image to obtain a first cue map, and the initial neural network model is trained with the training image and the first cue map to obtain a target neural network model. Because the cue map is fully used in model training, the target neural network can learn discriminative features from subtle face edits, which further improves deepfake detection performance on the face image to be identified.
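The element-wise subtraction of step S53 can be sketched as follows. This is a minimal illustration, assuming both images are same-sized 8-bit RGB arrays; the name `attack_cue_map` and the choice to keep the signed difference (rather than, say, its absolute value) are assumptions, since the disclosure does not specify either.

```python
import numpy as np


def attack_cue_map(training_image: np.ndarray, base_image: np.ndarray) -> np.ndarray:
    """Element-wise difference between a synthesized image and the image
    that supplied its background, hairstyle, head pose, and expression."""
    if training_image.shape != base_image.shape:
        raise ValueError("images must have the same shape")
    # Convert to float first so that uint8 subtraction cannot wrap around.
    return training_image.astype(np.float32) - base_image.astype(np.float32)


# Toy 2x2 RGB images: the "synthesis" edits a single pixel.
base = np.full((2, 2, 3), 100, dtype=np.uint8)
synthesized = base.copy()
synthesized[0, 0] = [90, 120, 100]

cue = attack_cue_map(synthesized, base)
print(cue[0, 0])                  # non-zero only where the face was edited
print(attack_cue_map(base, base).any())  # an unedited image gives an all-zero map
```

The cue map is non-zero only at edited pixels, which is what lets it serve as a fine-grained supervision signal.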

Fig. 6 is a flowchart of a method for identifying a face image according to a fifth embodiment of the present disclosure. As shown in fig. 6, the method may include the steps of:

Step S61, acquiring a face image to be identified;

Step S62, determining a training image and a first cue map based on a third image and a fourth image, where the third image and the fourth image are different real face images;

Step S63, obtaining a first attention map, a second attention map, and a third attention map based on the first cue map;

The sizes of the first attention map, the second attention map, and the third attention map decrease successively.

Specifically, for the implementation of obtaining the first attention map, the second attention map, and the third attention map based on the first cue map, reference may be made to the further description in the following embodiments, which is not repeated here.

Step S64, using the first attention map, the second attention map, and the third attention map to respectively perform spatial feature guidance on a plurality of depth feature layers in the initial neural network model to obtain a target neural network model;

Step S65, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an AI-synthesized image.

Steps S61, S62, and S65 are the same as steps S41, S42, and S44 in the above embodiments, and are not repeated here.

In the above optional embodiment, the first attention map, the second attention map, and the third attention map are obtained based on the first cue map, and are then used to respectively perform spatial feature guidance on the plurality of depth feature layers in the initial neural network model to obtain the target neural network model, so that the target neural network can learn discriminative features from subtle face edits, which further improves deepfake detection performance on the face image to be identified.
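The disclosure does not state how an attention map is applied to a feature layer. One common way to realize spatial guidance is to multiply every channel of a feature layer by an attention map of the same height and width; the sketch below shows that option purely as an assumption. The function name, the channel-first layout, and the layer shapes are all hypothetical.

```python
import numpy as np


def guide_features(features: np.ndarray, attention: np.ndarray) -> np.ndarray:
    """Weight a (channels, height, width) feature layer by a (height, width)
    attention map, broadcasting the map across channels.

    Element-wise multiplication is an assumed form of spatial guidance,
    not one confirmed by the disclosure.
    """
    if features.shape[1:] != attention.shape:
        raise ValueError("attention map must match the feature layer's height and width")
    return features * attention[np.newaxis, :, :]


rng = np.random.default_rng(0)

# Three feature layers of decreasing spatial size, as in a typical encoder.
feature_layers = [
    rng.standard_normal((8, 16, 16)),
    rng.standard_normal((16, 8, 8)),
    rng.standard_normal((32, 4, 4)),
]
# Three attention maps whose sizes decrease in step with the layers.
attention_maps = [np.full((16, 16), 0.5), np.full((8, 8), 0.5), np.full((4, 4), 0.5)]

guided = [guide_features(f, a) for f, a in zip(feature_layers, attention_maps)]
print([g.shape for g in guided])
```

Pairing the largest attention map with the shallowest layer follows from the requirement that the three maps decrease in size.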

Fig. 7 is a flowchart of a method for discriminating a face image according to a sixth embodiment of the present disclosure. As shown in fig. 7, the method may include the steps of:

step S71, obtaining a face image to be identified;

step S72, determining a training image and a first cable graph based on a third image and a fourth image, wherein the third image and the fourth image are different real face images;

step S73, performing convolution operation and mapping processing on the first clue graph to obtain a first attention graph;

specifically, the first attention map is obtained by performing convolution operation and mapping processing on a Sigmoid function on a convolution layer of 7 × 7 in the first index map, wherein an activation function of the convolution layer is a Linear rectification function (reli).

Step S74, performing downsampling processing on the first attention map to obtain a second attention map, and performing downsampling processing on the second attention map to obtain a third attention map;

step S75, respectively conducting space feature guidance on a plurality of depth feature layers in the initial neural network model by using the first attention diagram, the second attention diagram and the third attention diagram to obtain a target neural network model;

and step S76, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an artificial intelligent synthetic image.

Steps S71, S72, S75, and S76 are the same as steps S61, S62, S64, and S65 in the above embodiments and are not repeated here.

In the above optional embodiment, a convolution operation and mapping processing are performed on the first cue map to obtain the first attention map, the first attention map is downsampled to obtain the second attention map, and the second attention map is downsampled to obtain the third attention map. These attention maps of multiple sizes are applied to features at different depths of the encoder of the target neural network model to perform spatial feature guidance. As a result, the shallow layers of the network focus more on the attack cue region and supply more discriminative features to the deep layers, which further improves the performance of the target neural network model.
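The chain from cue map to three attention maps can be sketched as below. This is an illustration under stated assumptions, not the disclosed implementation: the cue map is single-channel, the 7 × 7 kernel is a uniform placeholder rather than a learned weight, and the downsampling is 2 × 2 average pooling (the source does not name a downsampling method). All function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Single-channel 2-D convolution with zero padding (output size = input size)."""
    height, width = len(image), len(image[0])
    size = len(kernel)
    pad = size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            total = bias
            for i in range(size):
                for j in range(size):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < height and 0 <= xx < width:
                        total += image[yy][xx] * kernel[i][j]
            row.append(total)
        out.append(row)
    return out


def first_attention_map(cue_map: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """7x7 convolution with ReLU activation, followed by a Sigmoid mapping."""
    conv = conv2d_same(cue_map, kernel, bias)
    return [[1.0 / (1.0 + math.exp(-max(v, 0.0))) for v in row] for row in conv]


def downsample2x(attention: Matrix) -> Matrix:
    """Halve height and width by averaging each 2x2 block (assumed method)."""
    height, width = len(attention) // 2, len(attention[0]) // 2
    return [
        [
            (attention[2 * y][2 * x] + attention[2 * y][2 * x + 1]
             + attention[2 * y + 1][2 * x] + attention[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(width)
        ]
        for y in range(height)
    ]


# 8x8 cue map with a 2x2 manipulated patch in the middle.
cue_map = [[0.0] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (3, 4):
        cue_map[y][x] = 1.0
kernel = [[1.0 / 49.0] * 7 for _ in range(7)]  # placeholder, not learned weights

att1 = first_attention_map(cue_map, kernel)   # 8x8
att2 = downsample2x(att1)                     # 4x4
att3 = downsample2x(att2)                     # 2x2
```

Because the Sigmoid is applied to a ReLU output, every attention value in this sketch lies in [0.5, 1), with the largest values near the manipulated patch.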

Fig. 8 is a flowchart of a method for discriminating a face image according to a seventh embodiment of the present disclosure. As shown in fig. 8, the method may include the steps of:

step S81, obtaining a face image to be identified;

step S82, determining a training image and a first cue map based on a third image and a fourth image, wherein the third image and the fourth image are different real face images;

step S83, training the initial neural network model by using the training image and the first cue map to obtain a target neural network model;

step S84, performing enhancement processing on the first cue map to obtain a fourth cue map;

the fourth cue map is used to provide supervision for a third cue map output by the target neural network model, and the third cue map is a predicted cue map corresponding to the face image to be identified.

Specifically, for the implementation of the enhancement processing that turns the first cue map into the fourth cue map, reference may be made to the description in the following embodiments, which is not repeated here.

Step S85, calculating a target loss by using the second cue map, the all-zero map corresponding to the second cue map, and the fourth cue map;

the target loss is the mean squared error loss function (L2 loss).

Step S86, optimizing the target neural network model by using the target loss;

and step S87, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an artificial intelligent synthetic image.

Steps S81, S82, S83, and S87 are the same as steps S41, S42, S43, and S44 in the above embodiments and are not repeated here.

In the above optional embodiment, the first cue map is enhanced to obtain the fourth cue map, the target loss is calculated by using the second cue map, the all-zero map corresponding to the second cue map, and the fourth cue map, and the target neural network model is optimized by using the target loss. Supervision and guidance are thereby provided on the deep features, yielding a more accurate identification result.
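The target loss can be illustrated with a short sketch. It assumes that the supervision target is chosen per sample: the enhanced (fourth) cue map when the training image is synthetic, and the all-zero map when it is real. The source names both targets but does not spell out how they are combined, so this selection rule and the function names are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def l2_loss(predicted: Matrix, target: Matrix) -> float:
    """Mean squared error between two maps of the same size."""
    count = len(predicted) * len(predicted[0])
    squared = sum(
        (p - t) ** 2
        for p_row, t_row in zip(predicted, target)
        for p, t in zip(p_row, t_row)
    )
    return squared / count


def target_loss(predicted: Matrix, enhanced_cue: Matrix, is_synthetic: bool) -> float:
    """Supervise with the enhanced cue map for synthetic inputs, else with zeros."""
    if is_synthetic:
        target = enhanced_cue
    else:
        target = [[0.0] * len(row) for row in predicted]  # all-zero map
    return l2_loss(predicted, target)


predicted = [[0.0, 0.5], [1.0, 0.0]]   # second cue map (model output)
enhanced = [[0.0, 1.0], [1.0, 0.0]]    # fourth cue map (enhanced label)
loss_fake = target_loss(predicted, enhanced, is_synthetic=True)
loss_real = target_loss(predicted, enhanced, is_synthetic=False)
```

In this toy case the same prediction is penalized far more when the sample is labeled real, because any non-zero pixel counts as an error against the all-zero map.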

Fig. 9 is a flowchart of a method for discriminating a face image according to an eighth embodiment of the present disclosure. As shown in fig. 9, the method may include the steps of:

step S91, obtaining a face image to be identified;

step S92, determining a training image and a first cue map based on a third image and a fourth image, wherein the third image and the fourth image are different real face images;

step S93, training the initial neural network model by using the training image and the first cue map to obtain a target neural network model;

step S94, performing binarization processing on the first cue map to obtain a processing result;

step S95, performing an image closing operation on the processing result to obtain a fourth cue map;

step S96, calculating a target loss by using the second cue map, the all-zero map corresponding to the second cue map, and the fourth cue map;

step S97, optimizing the target neural network model by using the target loss;

and step S98, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an artificial intelligent synthetic image.

Steps S91, S92, S93, S96, S97, and S98 are the same as steps S81, S82, S83, S85, S86, and S87 in the above embodiments and are not repeated here.

In the above optional embodiment, the enhancement processing of the first cue map is implemented by performing binarization processing on the first cue map to obtain a processing result and then performing an image closing operation on the processing result to obtain the fourth cue map.
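A minimal sketch of this enhancement is given below. The binarization threshold (0.1) and the 3 × 3 square structuring element are assumptions, since the source specifies neither. The closing operation is implemented in the usual way, as a dilation followed by an erosion, and pixels outside the image are ignored at the border.

```python
from typing import List

Mask = List[List[int]]


def binarize(cue_map: List[List[float]], threshold: float) -> Mask:
    """Set pixels above the threshold to 1 and all others to 0."""
    return [[1 if value > threshold else 0 for value in row] for row in cue_map]


def _morph(mask: Mask, dilate: bool) -> Mask:
    """3x3 dilation (max over the window) or erosion (min over the window)."""
    height, width = len(mask), len(mask[0])
    pick = max if dilate else min
    return [
        [
            pick(
                mask[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


def closing(mask: Mask) -> Mask:
    """Morphological closing: dilation followed by erosion."""
    return _morph(_morph(mask, dilate=True), dilate=False)


# 7x7 cue map: a 3x3 manipulated region whose centre pixel barely changed.
cue_map = [[0.0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        cue_map[y][x] = 0.9
cue_map[3][3] = 0.05

binary = binarize(cue_map, threshold=0.1)   # centre pixel is a hole
fourth_cue_map = closing(binary)            # hole is filled
```

The example shows why closing is a natural fit here: it fills small holes inside the manipulated region without enlarging the region itself.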

Fig. 10 is a schematic diagram of a method for identifying a face image according to an embodiment of the present disclosure. As shown in fig. 10, first, two original RGB face images, real image 1 and real image 2, are input and subjected to image synthesis processing to obtain a synthetic face image. The synthetic face image is used for the subsequent training of the target neural network model, and real image 1 serves as the original base image.

Secondly, element-wise subtraction is performed on the synthetic face image and the original base image to obtain an attack cue map. The network used by the target neural network model of the embodiment of the disclosure may be U-Net: after the input image is feature-encoded by the encoding network, the decoding network can output a feature map of the same size as the original image. An attack-cue-guided attention module is added at the U-Net encoding network. In this module, the attack cue map passes through a 7 × 7 convolution and a Sigmoid function to obtain a first attention map, and the first attention map is downsampled twice to obtain a second attention map and a third attention map of different sizes. The attention maps of multiple sizes act on features at different depths of the encoder to perform spatial feature guidance, so that the shallow network focuses more on the attack cue region and provides more discriminative features for the deep network.
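The element-wise subtraction that produces the attack cue map can be sketched as follows. For brevity the example uses a single-channel image; an RGB image would apply the same operation to each channel. Taking the absolute value of the difference is an assumption made here so that the cue map is non-negative; the source only says "element subtraction". The function name and pixel values are hypothetical.

```python
from typing import List

Image = List[List[int]]


def attack_cue_map(synthetic: Image, base: Image) -> Image:
    """Per-pixel absolute difference between the synthetic and the base image."""
    if len(synthetic) != len(base) or len(synthetic[0]) != len(base[0]):
        raise ValueError("images must have the same size")
    return [
        [abs(s - b) for s, b in zip(s_row, b_row)]
        for s_row, b_row in zip(synthetic, base)
    ]


# Original base image (real image 1) and a synthetic image edited at two pixels.
base = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
synthetic = [[10, 20, 30], [40, 200, 45], [70, 80, 90]]

cue = attack_cue_map(synthetic, base)
```

Pixels that the synthesis left untouched yield zero, so the non-zero entries of the cue map mark exactly where the base image was edited.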

The attack cue map also provides supervised guidance on the deep features. Specifically, the attack cue map is passed through an attack cue map enhancement module, which comprises binarization processing and an image closing operation, to obtain a more pronounced attack cue map that serves as the cue map label of the network. The network decoder decodes the features into an image of the same size as the original image, which is defined as the predicted cue map. The predicted cue map is used to judge whether the input image contains an attack cue: if the face image to be identified is a real face image, the output predicted cue map should be an all-zero map; if the input image is an artificial intelligent synthetic image, the output predicted cue map should be a non-all-zero map. When the target neural network model is trained, the enhanced attack cue map and the all-zero map are used to supervise the predicted cue map, and the loss function is the mean squared error loss function (L2 loss). During testing, the predicted cue map is output and its pixel values are averaged to obtain an attack score; the larger the attack score, the higher the probability that the face image to be identified is a synthetic image.
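The test-time scoring described above can be illustrated as follows. The averaging step comes from the text; the decision threshold of 0.5 and the function names are hypothetical, since the source only states that a larger score means a higher probability of a synthetic image.

```python
from typing import List

Matrix = List[List[float]]


def attack_score(predicted_cue_map: Matrix) -> float:
    """Average all pixel values of the predicted cue map."""
    pixels = [value for row in predicted_cue_map for value in row]
    return sum(pixels) / len(pixels)


def is_synthetic(predicted_cue_map: Matrix, threshold: float = 0.5) -> bool:
    """Flag the image as synthetic when the score exceeds an assumed threshold."""
    return attack_score(predicted_cue_map) > threshold


real_like = [[0.0, 0.0], [0.0, 0.0]]   # near all-zero prediction
fake_like = [[0.0, 1.0], [1.0, 1.0]]   # strong attack cue

score_real = attack_score(real_like)
score_fake = attack_score(fake_like)
```

In practice the threshold would be tuned on validation data rather than fixed in advance.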

The method for identifying a face image provided by the embodiments of the disclosure can be applied to various face recognition scenarios such as security, attendance, finance, and access control. It can improve the performance of face deepfake detection and helps improve the effect and user experience of applications based on that technology.

In the technical solution of the disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of the personal information of the users involved all comply with the relevant laws and regulations and do not violate public order and good customs.

Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions to enable a terminal device (which may be a mobile phone, a computer, a server, or a network device) to execute the methods according to the embodiments of the present disclosure.

The present disclosure also provides an apparatus for identifying a face image. The apparatus is used to implement the above embodiments and preferred embodiments, and descriptions that have already been given are not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

Fig. 11 is a block diagram illustrating a structure of an apparatus for discriminating a face image according to an embodiment of the present disclosure, and as shown in fig. 11, an apparatus 1100 for discriminating a face image includes:

an obtaining module 1101, configured to obtain a face image to be identified;

the identification module 1102 is configured to analyze a face image to be identified by using a target neural network model, and determine whether the face image to be identified is an artificial intelligent synthetic image;

Optionally, the obtaining module 1101 is configured to obtain the face image to be identified, and includes one of: acquiring a real face image to be identified; and acquiring a synthetic face image to be identified, wherein the synthetic face image to be identified is obtained by carrying out image synthesis processing on a first image and a second image, and the first image and the second image are different real face images.

Optionally, the identification module 1102 is further configured to: analyze the face image to be identified by using the target neural network model to obtain a third cue map; perform a mean value operation on the pixel values contained in the third cue map to obtain a calculation result; and determine whether the face image to be identified is an artificial intelligent synthetic image based on the calculation result.

Optionally, the apparatus 1100 for discriminating a face image further includes: a determining module 1103, configured to determine a training image and a first cue map based on a third image and a fourth image, where the third image and the fourth image are different real face images; and a training module 1104, configured to train the initial neural network model by using the training image and the first cue map to obtain a target neural network model.

Optionally, the determining module 1103 is further configured to: perform image synthesis processing by using the third image and the fourth image to obtain a training image and an original base image corresponding to the training image; and perform element-wise subtraction on the training image and the original base image to obtain a first cue map.

Optionally, the training module 1104 is further configured to: acquire a first attention map, a second attention map, and a third attention map based on the first cue map; and perform spatial feature guidance on a plurality of depth feature layers in the initial neural network model by using the first attention map, the second attention map, and the third attention map, respectively, to obtain a target neural network model.

Optionally, the training module 1104 is further configured to: perform a convolution operation and mapping processing on the first cue map to obtain a first attention map; perform downsampling processing on the first attention map to obtain a second attention map; and perform downsampling processing on the second attention map to obtain a third attention map.

Optionally, the apparatus 1100 for discriminating a face image further includes: an enhancing module 1105, configured to perform enhancement processing on the first cue map to obtain a fourth cue map; a processing module 1106, configured to calculate a target loss by using the second cue map, the all-zero map corresponding to the second cue map, and the fourth cue map; and an optimizing module 1107, configured to optimize the target neural network model by using the target loss.

Optionally, the enhancing module 1105 is further configured to: perform binarization processing on the first cue map to obtain a processing result; and perform an image closing operation on the processing result to obtain a fourth cue map.

It should be noted that the above modules may be implemented by software or hardware. In the latter case, this may be achieved in, but is not limited to, the following ways: the modules are all located in the same processor; or the modules are located in different processors in any combination.

According to an embodiment of the present disclosure, there is also provided an electronic device including a memory having stored therein computer instructions and at least one processor configured to execute the computer instructions to perform the steps in any of the above method embodiments.

Optionally, the electronic device may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.

Optionally, in the present disclosure, the processor may be configured to execute the following steps by means of a computer program:

S1, obtaining a face image to be identified;

S2, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an artificial intelligent synthetic image.

Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.

According to an embodiment of the present disclosure, there is also provided a non-transitory computer readable storage medium having stored therein computer instructions, wherein the computer instructions are arranged to perform the steps in any of the above method embodiments when executed.

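Optionally, in the present embodiment, the above-mentioned non-transitory computer-readable storage medium may be configured to store a computer program for executing the following steps:

S1, obtaining a face image to be identified;

S2, analyzing the face image to be identified by using the target neural network model, and determining whether the face image to be identified is an artificial intelligent synthetic image.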

Alternatively, in the present embodiment, the non-transitory computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

According to an embodiment of the present disclosure, there is also provided a computer program product. Program code for implementing the method for identifying a face image of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the above embodiments of the present disclosure, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

In the embodiments provided in the present disclosure, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a U disk, a Read Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

The foregoing is merely a preferred embodiment of the present disclosure, and it should be noted that modifications and embellishments could be made by those skilled in the art without departing from the principle of the present disclosure, and these should also be considered as the protection scope of the present disclosure.

Claims

1. A method of authenticating a face image, comprising:

acquiring a face image to be identified;

analyzing the face image to be identified by using a target neural network model, and determining whether the face image to be identified is an artificial intelligent synthetic image;

wherein the target neural network model is obtained through machine learning training using multiple groups of data, and each group of data in the multiple groups of data comprises: a training image, a first cue map, and a second cue map, wherein the first cue map is an attack cue map corresponding to the training image, and the second cue map is a predicted cue map corresponding to the training image.

2. The method of claim 1, wherein acquiring the face image to be identified comprises one of:

acquiring a real face image to be identified;

acquiring a synthetic face image to be identified, wherein the synthetic face image to be identified is obtained by performing image synthesis processing on a first image and a second image, and the first image and the second image are different real face images.

3. The method of claim 1, wherein the analyzing the face image to be identified using the target neural network model, and determining whether the face image to be identified is the artificial intelligence composite image comprises:

analyzing the face image to be identified by using the target neural network model to obtain a third cue map;

performing a mean value operation on the pixel values contained in the third cue map to obtain a calculation result;

and determining whether the face image to be identified is the artificial intelligence synthetic image or not based on the calculation result.

4. The method of claim 1, wherein the method further comprises:

determining the training image and the first cue map based on a third image and a fourth image, wherein the third image and the fourth image are different real face images;

and training an initial neural network model by using the training image and the first cue map to obtain the target neural network model.

5. The method of claim 4, wherein determining the training image and the first cue map based on the third image and the fourth image comprises:

performing image synthesis processing by using the third image and the fourth image to obtain the training image and an original base image corresponding to the training image;

and performing element-wise subtraction on the training image and the original base image to obtain the first cue map.

6. The method of claim 4, wherein training the initial neural network model by using the training image and the first cue map to obtain the target neural network model comprises:

acquiring a first attention map, a second attention map, and a third attention map based on the first cue map;

and performing spatial feature guidance on a plurality of depth feature layers in the initial neural network model by using the first attention map, the second attention map, and the third attention map, respectively, to obtain the target neural network model.

7. The method of claim 6, wherein obtaining the first, second, and third attention maps based on the first cue map comprises:

performing a convolution operation and mapping processing on the first cue map to obtain the first attention map;

and performing downsampling processing on the first attention map to obtain the second attention map, and performing downsampling processing on the second attention map to obtain the third attention map.

8. The method of claim 4, wherein the method further comprises:

performing enhancement processing on the first cue map to obtain a fourth cue map;

calculating target loss by using the second cue map, the all-zero map corresponding to the second cue map and the fourth cue map;

and optimizing the target neural network model by adopting the target loss.

9. The method of claim 8, wherein performing enhancement processing on the first cue map to obtain the fourth cue map comprises:

performing binarization processing on the first cue map to obtain a processing result;

and performing an image closing operation on the processing result to obtain the fourth cue map.

10. An apparatus for discriminating a face image, comprising:

the acquisition module is used for acquiring a face image to be identified;

the identification module is used for analyzing the face image to be identified by utilizing a target neural network model and determining whether the face image to be identified is an artificial intelligent synthetic image;

11. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-9.

12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-9.

13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-9.