CN113822160B

CN113822160B - Evaluation method, system and device for a deep forgery detection model

Info

Publication number: CN113822160B
Application number: CN202110963494.3A
Authority: CN
Inventors: 蔺琛皓; 邓静怡; 沈超; 胡鹏斌; 王骞; 李琦
Original assignee: Xian Jiaotong University
Current assignee: Xian Jiaotong University
Priority date: 2021-08-20
Filing date: 2021-08-20
Publication date: 2023-09-19
Anticipated expiration: 2041-08-20
Also published as: CN113822160A

Abstract

The invention belongs to the field of image processing and discloses a method, a system and a device for evaluating a deep forgery detection model. The method comprises the following steps: following the training method of the deep forgery detection model to be evaluated, training that model separately on each preset type of deep forgery data set to obtain the trained deep forgery detection models; testing each trained model on a preset diversified difficult sample set to obtain an accuracy index value under same-distribution data and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data; and performing a weighted superposition of these index values according to preset weights to obtain the evaluation result of the model to be evaluated. This establishes an accurate, fair and comprehensive evaluation method whose results better reflect the actual behavior of the deep forgery detection model.

Description

Evaluation method, system and device for a deep forgery detection model

Technical Field

The invention belongs to the field of image processing and relates to a method, a system and a device for evaluating a deep forgery detection model.

Background

In recent years, artificial intelligence technology, represented by deep learning algorithms, has developed continuously and produced repeated breakthroughs on many computer vision tasks. Its successful application has brought many conveniences to daily life and social production, for example in intelligent video surveillance, autonomous driving and intelligent medical care. However, misuse of such technology can seriously threaten personal privacy. Recently proposed deep-learning-based deep forgery (deepfake) techniques tamper with or replace the face information in an original video and mislead people into believing false statements in that video, which poses a new threat in the form of privacy infringement, fabrication of false statements and harm to national security. In the face of the malicious spread of deep forgery videos and pictures, the related detection technology is receiving increasing attention from researchers. Although many large-scale deep forgery data sets and detection methods have been proposed, the detection models use inconsistent data sets for training and inference and rely on too narrow a set of evaluation indexes, so it is difficult to judge the detection accuracy of each model fairly.

In response to these problems, and with the appearance of large-scale deep forgery data sets in recent years, some research has made preliminary attempts to establish a deep forgery detection benchmark. For example, researchers have proposed a continuously updated online evaluation method: the owner of a detection method tests the model on the test data set provided by the website and uploads the inference results, and the website reviews the results and publishes the evaluation index scores in the benchmark information it maintains. This benchmark has limitations. First, it lacks a detailed description of important information about the benchmark data set, such as the data size and the forgery types of the fake data it contains. Second, it does not strictly control the training process and training data of the methods being evaluated; it only provides a submission guide and lets participants submit offline test results themselves, which cannot guarantee a fair evaluation of different methods. Another evaluation effort proposed a large-scale deep forgery data set and organized a deep forgery detection competition on it. However, because it only evaluates the methods entered in its competition and places no restrictions on their training process, it does not provide a strictly fair evaluation of the existing mainstream detection methods.

In addition, current evaluations of deep forgery detection models are unfair and insufficient for the following reasons, which leads to inaccurate results. First, many evaluations compare models that were trained on different training data sets. For example, some directly apply publicly released trained models instead of re-implementing the models and training them on the same data; such inconsistency in training data leads to unfair and incorrect comparisons between methods. Second, most deep forgery detection models are trained and evaluated on same-distribution data produced by only a limited number of forgery generation methods, so they suffer from overfitting and poor transferability. As a result, most detection models that appear to perform well degrade considerably when applied in real scenes.

Disclosure of Invention

The invention aims to overcome the defects in the prior art and provide a method, a system and equipment for evaluating a deep forgery detection model.

In order to achieve the purpose, the invention is realized by adopting the following technical scheme:

In a first aspect of the present invention, a method for evaluating a deep forgery detection model includes the following steps:

acquiring the training method of the deep forgery detection model to be evaluated;

according to that training method, training the deep forgery detection model to be evaluated separately on each preset type of deep forgery data set to obtain the trained deep forgery detection models;

testing each trained deep forgery detection model on a preset diversified difficult sample set and on each type of deep forgery data set to obtain an accuracy index value of the trained model under same-distribution data, and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data;

and performing a weighted superposition, according to preset weights, of the accuracy index value under same-distribution data and the generalization, robustness and practicability index values under non-same-distribution data to obtain the evaluation result of the deep forgery detection model to be evaluated.

The evaluation method of the depth forgery detection model is further improved as follows:

The types of deep forgery data sets include: a data set whose fake data is generated by a GAN-based generation method; a data set whose fake data is generated by an autoencoder-based generation method; a data set whose fake data is generated by a graphics-based generation method; and a data set whose fake data is generated by at least two of the GAN-based, autoencoder-based and graphics-based generation methods.

Each preset type of deep forgery data set comprises a training set, a validation set and a test set, and the proportions among the training set, the validation set and the test set are the same across all types of deep forgery data sets.

Before the deep forgery detection model to be evaluated is trained on each preset deep forgery data set, video frame extraction, face extraction from images, or face correction is performed on the sample data in the deep forgery data sets;

or the sample data preprocessing method of the deep forgery detection model to be evaluated is acquired, and the sample data in the deep forgery data sets is preprocessed according to that method.

The diversified difficult sample set comprises standard sample data and disturbance sample data;

The standard sample data comprises automatic standard sample data and manual standard sample data. The automatic standard sample data is obtained as follows: a trained deep forgery detection model predicts a fake score for the sample data in each type of deep forgery data set, and sample data whose fake score is below a preset fake score threshold is taken as automatic standard sample data. The manual standard sample data is obtained as follows: a preset number of users judge the authenticity of each sample in each type of deep forgery data set, and samples misjudged by more than half of the users are taken as manual standard sample data. The disturbance sample data is obtained by adding a preset type of disturbance to the standard sample data.

The preset type of disturbance includes one or more of Gaussian blur, white Gaussian noise, color contrast change, and color saturation change.
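As an illustration of single and mixed disturbances, the following minimal sketch applies white Gaussian noise and a contrast change to a flat list of 8-bit grey values. The function names, the grey-value representation and the parameters `sigma`, `contrast` and `seed` are assumptions made for this example and do not come from the patent; Gaussian blur and saturation change are omitted for brevity.

```python
import random


def add_white_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma` to each pixel."""
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in pixels]


def change_contrast(pixels, factor):
    """Scale each pixel's distance from the mean intensity by `factor`."""
    mean = sum(pixels) / len(pixels)
    return [min(255, max(0, round(mean + factor * (p - mean)))) for p in pixels]


def perturb(pixels, sigma=0.0, contrast=1.0, seed=0):
    """Apply a single or mixed disturbance (hypothetical helper).

    Setting only `sigma` or only `contrast` gives a single disturbance;
    setting both gives a mixed disturbance applied in sequence.
    """
    rng = random.Random(seed)
    out = list(pixels)
    if sigma > 0:
        out = add_white_gaussian_noise(out, sigma, rng)
    if contrast != 1.0:
        out = change_contrast(out, contrast)
    return out
```

Varying `sigma` or `contrast` over a range of values would give the different disturbance degrees against which AUC can later be plotted.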

Real sample data is then selected from the various types of deep forgery data sets and added to the standard sample data until the number of real samples in the standard sample data equals the number of fake samples.

The accuracy index value comprises AUC, accuracy and precision; the generalization index value comprises AUC, accuracy and precision; the robustness index value comprises the area under the disturbance degree-AUC curve; the practicability index value comprises the ratio of the vertical axis to the horizontal axis of the model parameter count-AUC scatter plot, of the required computation-AUC scatter plot, and of the model inference time-AUC scatter plot.
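The three quantities that make up the accuracy and generalization index values can be computed as in the sketch below. It assumes labels of 1 for fake and 0 for real, a model score where higher means more likely fake, and a decision threshold of 0.5; none of these conventions is stated in the patent, and the function names are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of fake vs. real scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A fake/real pair counts 1 if the fake scores higher, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy_and_precision(labels, scores, threshold=0.5):
    """Accuracy and precision of thresholded predictions (threshold is an assumption)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    correct = sum(1 for l, p in zip(labels, preds) if l == p)
    accuracy = correct / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, precision
```

The same functions apply to both indexes; only the test data differs (same-distribution test sets for accuracy, the diversified difficult sample set for generalization).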

In a second aspect of the present invention, an evaluation system of a deep forgery detection model includes:

the acquisition module is used for acquiring a training method of the depth counterfeiting detection model to be evaluated;

the training module is used for respectively training the depth forging detection models to be evaluated according to the training method of the depth forging detection models to be evaluated through preset depth forging data sets of all types to obtain trained depth forging detection models;

the testing module is used for testing each trained deep forgery detection model on a preset diversified difficult sample set and on each type of deep forgery data set to obtain an accuracy index value of the trained model under same-distribution data, and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data;

the evaluation module is used for carrying out weighted superposition on the accuracy index value of the trained depth falsification detection model under the same distribution data, the generalization index value, the robustness index value and the practicability index value under the non-same distribution data according to preset weights to obtain an evaluation result of the depth falsification detection model to be evaluated.

In a third aspect of the present invention, a terminal device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for evaluating a deep forgery detection model described above.

Compared with the prior art, the invention has the following beneficial effects:

In the evaluation method of the invention, the training method of the deep forgery detection model to be evaluated is acquired so that its training process can be reproduced faithfully, and the model is then trained on each preset type of deep forgery data set to obtain one trained model per data set type, which ensures that later evaluation within and across data set types is fair. Testing is then based on a preset diversified difficult sample set. This set contains forged videos that are highly deceptive to both human eyes and detection algorithms and that are generated by a variety of classical deep forgery generation methods, so it is diverse and challenging and allows the model to be evaluated comprehensively. Four evaluation indexes are provided: an accuracy index value under same-distribution data, and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data. They evaluate the model in terms of accuracy, generalization, robustness and practicability, and their results are combined into the final evaluation score of the model. This establishes an accurate, fair and comprehensive evaluation standard, and the resulting evaluation better reflects the actual behavior of the deep forgery detection model.

Drawings

FIG. 1 is a flow chart of an evaluation method of a deep forgery detection model of the present invention;

FIG. 2 is a schematic diagram of a flow chart for generating a diversified difficult sample set according to the present invention;

FIG. 3 is a block diagram of a terminal device according to the present invention.

Detailed Description

To help those skilled in the art better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort shall fall within the scope of protection of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The invention is described in further detail below with reference to the attached drawing figures:

Referring to FIG. 1, an embodiment of the present invention provides a method for evaluating a deep forgery detection model, which includes the following steps.

S1: Obtain the training method of the deep forgery detection model to be evaluated.

Specifically, deep forgery detection models generally include image, video and audio deep forgery detection models. For each model to be evaluated, the training method used during evaluation, including the training process and parameter settings, strictly follows the description in the literature for that detection method.

S2: According to the training method of the deep forgery detection model to be evaluated, train the model separately on each preset type of deep forgery data set to obtain the trained deep forgery detection models.

Specifically, most current evaluations compare deep forgery detection models that were trained on different training data sets. Because transferability and generalization across sample data of different forgery types are poor, comparisons between models trained on inconsistent data sets are unfair and incorrect. To address this, this embodiment reproduces each model's training method and trains the model on each preset type of deep forgery data set, yielding one trained model per data set type, so that later evaluation within and across data set types is fair.
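Step S2 amounts to applying one reproduced training recipe to every data set type. The sketch below shows that loop under stated assumptions: `train_fn`, `recipe` and the `"train"`/`"val"` keys are hypothetical placeholders for whatever training code and data layout a given detection model actually uses.

```python
def train_on_each_dataset(train_fn, recipe, datasets):
    """Train one model per data set type with the same reproduced recipe.

    train_fn : callable(recipe, train_split, val_split) -> trained model (hypothetical)
    recipe   : training process and parameter settings taken from the method's literature
    datasets : mapping from data set type name to its {"train": ..., "val": ...} splits
    """
    trained = {}
    for name, splits in datasets.items():
        # Identical recipe for every data set type keeps the comparison fair.
        trained[name] = train_fn(recipe, splits["train"], splits["val"])
    return trained
```

Keeping the recipe fixed and varying only the data set is what makes scores comparable across detection models.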

Preferably, the preset types of deep forgery data sets include: a data set whose fake data is generated by a GAN-based generation method; a data set whose fake data is generated by an autoencoder-based generation method; a data set whose fake data is generated by a graphics-based generation method; and a data set whose fake data is generated by at least two of the GAN-based, autoencoder-based and graphics-based generation methods.

Specifically, the deep forgery data sets are the mainstream public data sets commonly used in existing evaluations of deep forgery detection models. They are classified according to the underlying principle of the generation method used to produce the fake sample data they contain, and the generation methods of the fake sample data in the collected data sets cover GAN-based, autoencoder-based and graphics-based generation methods.

Preferably, each preset type of deep forgery data set comprises a training set, a validation set and a test set, and the proportions among the training set, the validation set and the test set are the same across all types of deep forgery data sets.

Specifically, when all collected deep forgery data sets are divided into training, validation and test sets, one type of deep forgery data set is chosen as the reference data set and the sample counts of its training, validation and test sets are fixed. The sample counts of the training, validation and test sets of every other data set are then determined from the ratios among the reference data set's training, validation and test sets.
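The split rule can be sketched as follows. The function name, the use of integer division, and the choice to assign any rounding remainder to the test split are assumptions of this example; the patent only requires that the proportions match those of the reference data set.

```python
def split_sizes(total, reference_sizes):
    """Scale the reference (train, val, test) sizes to a data set with `total` samples."""
    ref_train, ref_val, ref_test = reference_sizes
    ref_total = ref_train + ref_val + ref_test
    train = total * ref_train // ref_total
    val = total * ref_val // ref_total
    test = total - train - val  # rounding remainder goes to the test split (assumption)
    return train, val, test
```

For instance, with a reference split of (720, 140, 140), a data set of 500 samples would be divided into 360, 70 and 70 samples.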

Preferably, before the deep forgery detection model to be evaluated is trained on each preset deep forgery data set, video frame extraction, face extraction from images, or face correction is performed on the sample data in the deep forgery data sets; alternatively, the sample data preprocessing method of the model to be evaluated is acquired and the sample data is preprocessed according to that method.

Specifically, because deep forgery detection models usually take a single-frame or multi-frame face image as input, a data preprocessing flow is generally necessary for training and testing them. The preprocessing in this embodiment includes operations common to all methods, such as video frame extraction, face extraction from images and face correction, as well as operations specific to individual detection models, such as facial landmark extraction and generation of additional supervision information.

S3: Test each trained deep forgery detection model on a preset diversified difficult sample set and on each type of deep forgery data set to obtain an accuracy index value of the trained model under same-distribution data, and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data.

The diversified difficult sample set comprises standard sample data and disturbance sample data. The standard sample data comprises automatic standard sample data and manual standard sample data. The automatic standard sample data is obtained as follows: a trained deep forgery detection model predicts a fake score for the sample data in each type of deep forgery data set, and sample data whose fake score is below a preset fake score threshold is taken as automatic standard sample data. The manual standard sample data is obtained as follows: a preset number of users judge the authenticity of each sample in each type of deep forgery data set, and samples misjudged by more than half of the users are taken as manual standard sample data. The disturbance sample data is obtained by adding a preset type of disturbance to the standard sample data.

Specifically, a diversified difficult sample set is designed to simulate the threat posed by deep forgery data in real scenes. Referring to FIG. 2, the diversified difficult sample set includes standard sample data and disturbance sample data. The standard sample data comes from the deep forgery data sets described above, and the categories of fake digital content it contains cover all categories found in every type of deep forgery data set. The fake digital content in the standard sample data is obtained through automatic model screening followed by manual screening. In automatic model screening, a fake score threshold is set first; the trained model then screens the fake sample data, and samples whose predicted fake score is below the threshold are added to the diversified difficult sample set, giving an initial diversified difficult sample set.
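Automatic model screening can be sketched as a simple filter. The name `predict_fake_score` stands in for a trained detection model and the default threshold of 0.5 is an arbitrary illustrative value; the patent does not specify either.

```python
def screen_hard_fakes(fake_samples, predict_fake_score, threshold=0.5):
    """Keep fake samples whose predicted fake score is below the threshold.

    A low fake score on a sample known to be fake means the trained model
    was fooled, so the sample is considered difficult.
    """
    return [s for s in fake_samples if predict_fake_score(s) < threshold]
```

Only fake samples are passed in, so every sample that survives the filter is one the model tends to mistake for real.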

In manual screening, a user test is carried out on the fake sample data in the initial diversified difficult sample set. During the test, users observe each sample and judge whether it is real or fake without knowing its correct label. After the test, the results are aggregated and only samples misjudged by more than half of the users are kept in the diversified difficult sample set. Some real sample data is then selected from the real sample data of each deep forgery data set and added to the set so that the numbers of fake and real samples are equal and the classes are balanced, giving the standard data of the diversified difficult sample set. Several types of disturbance are then added to the standard data to generate disturbance data with single and mixed disturbances. The preset types of disturbance include one or more of Gaussian blur, white Gaussian noise, color contrast change and color saturation change.
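The manual screening and class balancing steps can be sketched as below. The vote encoding (True when a user wrongly judged a fake sample to be real), the random selection of real samples and the `seed` parameter are assumptions made for illustration; the patent does not say how the real samples are chosen.

```python
import random


def manual_screen(fake_samples, user_votes):
    """Keep fake samples that more than half of the users judged to be real.

    user_votes maps each sample to a list of booleans, True meaning the
    user misjudged the fake sample as real (hypothetical encoding).
    """
    kept = []
    for sample in fake_samples:
        votes = user_votes[sample]
        if sum(votes) * 2 > len(votes):  # strictly more than half were fooled
            kept.append(sample)
    return kept


def balance_with_real(hard_fakes, real_pool, seed=0):
    """Add as many real samples as there are hard fakes (random choice is an assumption)."""
    rng = random.Random(seed)
    # random.sample raises ValueError if the pool holds fewer real samples than needed.
    reals = rng.sample(real_pool, len(hard_fakes))
    return hard_fakes + reals
```

Note that a sample misjudged by exactly half of the users is not kept, since the text requires more than half.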

S4: Perform a weighted superposition, according to preset weights, of the accuracy index value of the trained deep forgery detection model under same-distribution data and the generalization, robustness and practicability index values under non-same-distribution data to obtain the evaluation result of the deep forgery detection model to be evaluated.

Widely used evaluation indexes, including AUC (area under the ROC curve) and accuracy, cannot fully reflect the performance of a detection method, and previous studies used no evaluation indexes related to time and space complexity. As a result, a detection method that appears excellent may be inefficient in a real scene containing large-scale and diverse forged videos and pictures. An accurate and reasonable evaluation process and comprehensive, practical evaluation indexes are therefore important for understanding the strengths and limitations of existing mainstream deep forgery detection methods. This embodiment provides four evaluation indexes that assess a deep forgery detection model in terms of accuracy, generalization, robustness and practicability, and combines their results into a benchmark evaluation score for the method, thereby establishing an accurate, fair and comprehensive evaluation benchmark for deep forgery detection models.

Same-distribution data refers to data whose deep forgery generation method is consistent with that of the fake sample data in the training data set, and non-same-distribution data refers to data whose deep forgery generation method is inconsistent with it. The accuracy index value comprises AUC (area under the ROC curve), accuracy and precision; the generalization index value comprises AUC, accuracy and precision; the robustness index value comprises the area under the disturbance degree-AUC curve; the practicability index value comprises the ratio of the vertical axis to the horizontal axis of the model parameter count-AUC scatter plot, of the required computation-AUC scatter plot, and of the model inference time-AUC scatter plot.
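The robustness and practicability index values can be sketched as follows. The trapezoidal rule is one common way to compute an area under sampled points and is an assumption here, as is the axis assignment for the practicability ratio: this example takes AUC as the vertical axis and the cost (parameter count, required computation or inference time) as the horizontal axis, which the text does not state explicitly.

```python
def area_under_curve(levels, aucs):
    """Trapezoidal area under the disturbance degree-AUC curve."""
    points = sorted(zip(levels, aucs))  # order by disturbance degree
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def practicality_ratio(auc, cost):
    """Ratio of AUC (assumed vertical axis) to a cost (assumed horizontal axis)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return auc / cost
```

Under these assumptions, a model whose AUC falls slowly as the disturbance degree grows gets a larger robustness value, and a model that reaches the same AUC at lower cost gets a larger practicability ratio.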

Specifically, this embodiment formulates comprehensive and fair measurement indexes for deep forgery detection methods in order to quantify their accuracy, generalization, robustness and practicability. During evaluation, each trained deep forgery detection model is evaluated on same-distribution data, namely the test set of each deep forgery data set, to obtain its accuracy index value; on non-same-distribution data, namely the standard data of the diversified difficult sample set, to obtain its generalization index value; on the disturbance data of the diversified difficult sample set to obtain its robustness index value; and on the standard data of the diversified difficult sample set to obtain its practicability index value. The four indexes are then combined into a comprehensive evaluation index, the benchmark evaluation score of each trained deep forgery detection model is calculated, and the models are ranked by that score. In this embodiment, the weight of each index is set to 1 to obtain the final comprehensive evaluation result.
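The weighted superposition can be written as a weighted sum with every weight defaulting to 1, as in this embodiment. The sketch assumes each index has already been reduced to a single number; how the several components of one index (for example AUC, accuracy and precision) are aggregated is not specified in the text, and the dictionary keys are hypothetical names.

```python
def evaluation_score(index_values, weights=None):
    """Weighted sum of index values; any index without a weight uses 1.0."""
    weights = weights or {}
    return sum(weights.get(name, 1.0) * value for name, value in index_values.items())


# Illustrative values only, not results from the patent.
example = {
    "accuracy": 0.5,
    "generalization": 0.25,
    "robustness": 1.5,
    "practicability": 0.25,
}
```

With all weights equal to 1, `evaluation_score(example)` is simply the sum of the four values; passing a `weights` mapping changes the emphasis placed on any one index.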

Subsequently, based on the evaluation results of the deep forgery detection models, namely the index values after weighted superposition, the model with the best evaluation result can be selected and used to perform deep forgery detection on each image to be detected, yielding more accurate detection results.

In summary, in the evaluation method of the invention, the training method of the deep forgery detection model to be evaluated is acquired so that its training process can be reproduced faithfully, and the model is then trained on each preset type of deep forgery data set to obtain one trained model per data set type, which ensures that later evaluation within and across data set types is fair. Testing is then based on a preset diversified difficult sample set. This set contains forged videos that are highly deceptive to both human eyes and detection algorithms and that are generated by a variety of classical deep forgery generation methods, so it is diverse and challenging and allows the model to be evaluated comprehensively. Four evaluation indexes are provided: an accuracy index value under same-distribution data, and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data. They evaluate the model in terms of accuracy, generalization, robustness and practicability, and their results are combined into the final evaluation score of the model. This establishes an accurate, fair and comprehensive evaluation standard, and the resulting evaluation better reflects the actual behavior of the deep forgery detection model.

The following are device embodiments of the present invention, which may be used to perform the method embodiments of the present invention. For details not described in the device embodiments, please refer to the method embodiments of the present invention.

In still another embodiment of the present invention, an evaluation system for a deep forgery detection model is provided, which can be used to implement the evaluation method described above. Specifically, the system includes an acquisition module, a training module, a testing module and an evaluation module.

The acquisition module is used to acquire the training method of the deep forgery detection model to be evaluated. The training module is used to train the model to be evaluated, according to that training method, separately on each preset type of deep forgery data set to obtain the trained deep forgery detection models. The testing module is used to test each trained model on the test set of each type of deep forgery data set and on the preset diversified difficult sample set to obtain an accuracy index value under same-distribution data, and a generalization index value, a robustness index value and a practicability index value under non-same-distribution data. The evaluation module is used to perform a weighted superposition, according to preset weights, of these index values to obtain the evaluation result of the deep forgery detection model to be evaluated.
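One way to picture how the four modules fit together is a small pipeline object that chains them. This is only a structural sketch: the class name, the callable signatures and the `run` method are hypothetical and are not defined in the patent.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class EvaluationSystem:
    """Chains the four modules; each field is a pluggable callable (hypothetical)."""

    acquire: Callable[[str], Any]                           # acquisition module
    train: Callable[[Any, Dict[str, Any]], Dict[str, Any]]  # training module
    test: Callable[[Dict[str, Any]], Dict[str, float]]      # testing module
    evaluate: Callable[[Dict[str, float]], float]           # evaluation module

    def run(self, model_name: str, datasets: Dict[str, Any]) -> float:
        recipe = self.acquire(model_name)       # training method of the model
        trained = self.train(recipe, datasets)  # one trained model per data set type
        index_values = self.test(trained)       # the four index values
        return self.evaluate(index_values)      # weighted superposition
```

Each callable can be replaced independently, which mirrors the modular description of the system.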

Referring to Fig. 3, in yet another embodiment, the present invention provides a terminal device, which may be a computer device, including a processor, an input device, an output device, and a computer-readable storage medium. The processor, input device, output device, and computer-readable storage medium may be connected by a bus or by other means.

The computer-readable storage medium is used for storing a computer program comprising program instructions, and the processor is used for executing the program instructions stored in the computer-readable storage medium. The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. As the computing core and control core of the terminal, it is adapted to implement one or more instructions, and in particular to load and execute one or more instructions in the computer-readable storage medium so as to implement the corresponding method flow or function. The processor in this embodiment may be used to carry out the above evaluation method for a deep forgery detection model.

In yet another embodiment of the present invention, a storage medium is provided, specifically a computer-readable storage medium (memory), which is a memory device in a computer device used for storing programs and data. It is understood that the computer-readable storage medium here may include both a built-in storage medium of the computer device and an extended storage medium supported by the computer device. The computer-readable storage medium provides a storage space that stores the operating system of the terminal. The storage space also stores one or more instructions adapted to be loaded and executed by the processor; these instructions may be one or more computer programs (including program code). The computer-readable storage medium may be a high-speed RAM or a non-volatile memory, such as at least one magnetic disk memory. The one or more instructions stored in the computer-readable storage medium may be loaded and executed by the processor to implement the corresponding steps of the evaluation method for a deep forgery detection model in the above embodiments.

It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that modifications and equivalent substitutions may be made to the specific embodiments without departing from the spirit and scope of the invention, and all such modifications and substitutions are intended to be covered by the claims.

Claims

1. A method for evaluating a deep forgery detection model, characterized by comprising the following steps:

acquiring a training method of a deep forgery detection model to be evaluated;

weighting and superposing, according to preset weights, the accuracy index value of the trained deep forgery detection model under identically distributed data and the generalization index value, the robustness index value and the practicability index value under non-identically distributed data, to obtain an evaluation result of the deep forgery detection model to be evaluated;

the types of the deep forgery data sets comprise a deep forgery data set whose false data is generated by a GAN-based generation method, a deep forgery data set whose false data is generated by an autoencoder-based generation method, a deep forgery data set whose false data is generated by a graphics-based generation method, and a deep forgery data set whose false data is generated by at least two of the GAN-based generation method, the autoencoder-based generation method and the graphics-based generation method;

2. The method for evaluating a deep forgery detection model according to claim 1, wherein the preset deep forgery data sets of each type each include a training set, a verification set and a test set; and the proportions among the training set, the verification set and the test set are the same in the various types of deep forgery data sets.

3. The method for evaluating a deep forgery detection model according to claim 1, wherein, before the deep forgery detection model to be evaluated is trained on each of the preset deep forgery data sets, the sample data in the deep forgery data sets are subjected to video frame extraction, image face extraction or face correction processing;

4. The method for evaluating a deep forgery detection model according to claim 1, wherein the disturbance of the preset type includes one or more of Gaussian blur, Gaussian white noise, color contrast change, and color saturation change.
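By way of illustration only, the four disturbance types named above might be applied to an image as in the following Python sketch. The image representation (rows of RGB tuples with channels in [0, 1]), the function names, the mid-gray contrast formula, the HSV-based saturation scaling and the clamped border handling are all assumptions of this sketch and are not taken from the claims.

```python
import colorsys
import math
import random
from typing import Callable, List, Tuple

Pixel = Tuple[float, float, float]  # RGB channels in [0, 1] (assumed representation)
Image = List[List[Pixel]]           # list of rows


def _clip(x: float) -> float:
    return min(1.0, max(0.0, x))


def _map_pixels(img: Image, fn: Callable[[Pixel], Pixel]) -> Image:
    return [[fn(p) for p in row] for row in img]


def gaussian_noise(img: Image, sigma: float, seed: int = 0) -> Image:
    """Add zero-mean Gaussian white noise with standard deviation sigma."""
    rng = random.Random(seed)
    return _map_pixels(img, lambda p: tuple(_clip(c + rng.gauss(0.0, sigma)) for c in p))


def change_contrast(img: Image, factor: float) -> Image:
    """Scale each channel's distance from mid-gray (0.5) by factor."""
    return _map_pixels(img, lambda p: tuple(_clip(0.5 + factor * (c - 0.5)) for c in p))


def change_saturation(img: Image, factor: float) -> Image:
    """Scale the HSV saturation of every pixel by factor."""
    def fn(p: Pixel) -> Pixel:
        h, s, v = colorsys.rgb_to_hsv(*p)
        return colorsys.hsv_to_rgb(h, _clip(s * factor), v)
    return _map_pixels(img, fn)


def gaussian_blur(img: Image, sigma: float) -> Image:
    """Separable Gaussian blur (sigma > 0); border pixels are clamped."""
    radius = max(1, int(3 * sigma))
    offsets = list(range(-radius, radius + 1))
    kernel = [math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in offsets]
    total = sum(kernel)
    kernel = [k / total for k in kernel]  # normalize so brightness is preserved
    height, width = len(img), len(img[0])

    def one_pass(src: Image, horizontal: bool) -> Image:
        out: Image = []
        for y in range(height):
            row = []
            for x in range(width):
                acc = [0.0, 0.0, 0.0]
                for d, k in zip(offsets, kernel):
                    yy = y if horizontal else min(height - 1, max(0, y + d))
                    xx = min(width - 1, max(0, x + d)) if horizontal else x
                    for c in range(3):
                        acc[c] += k * src[yy][xx][c]
                row.append((acc[0], acc[1], acc[2]))
            out.append(row)
        return out

    return one_pass(one_pass(img, True), False)
```

Sweeping `sigma` or `factor` over several values would give the different disturbance degrees at which a trained model can be tested.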

5. The method for evaluating a deep forgery detection model according to claim 1, further comprising selecting real sample data from the various types of deep forgery data sets as standard sample data until the number of false samples in all the standard sample data equals the number of real samples.

6. The method for evaluating a deep forgery detection model according to claim 1, wherein the accuracy index value includes AUC, accuracy and precision; the generalization index value includes AUC, accuracy and precision; the robustness index value includes the area under the disturbance degree-AUC curve; and the practicability index value includes the ratio of the vertical-axis value to the horizontal-axis value in the model parameter count-AUC scatter plot, in the model required computing power-AUC scatter plot, and in the model inference time-AUC scatter plot.
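For illustration, the quantities named above can be computed as in the following Python sketch. It is a minimal outline under stated assumptions: labels use 1 for forged and 0 for real samples, AUC is computed by pairwise comparison of scores, the area under the disturbance degree-AUC curve uses the trapezoidal rule, and AUC is taken as the vertical axis of each scatter plot; none of these choices is fixed by the claim, and all function names are hypothetical.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def precision(labels: Sequence[int], preds: Sequence[int]) -> float:
    """True positives divided by all predicted positives (0.0 if none are predicted)."""
    predicted_pos = sum(p == 1 for p in preds)
    true_pos = sum(l == 1 and p == 1 for l, p in zip(labels, preds))
    return true_pos / predicted_pos if predicted_pos else 0.0


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def area_under_curve(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Trapezoidal area under the curve through (xs[i], ys[i]); xs must be increasing."""
    pts = list(zip(xs, ys))
    return sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def practicability_ratio(auc_value: float, cost: float) -> float:
    """Vertical-axis value (AUC, assumed) over horizontal-axis value (e.g. parameter count)."""
    return auc_value / cost
```

Here `area_under_curve` would take the disturbance degrees as `xs` and the AUC measured at each degree as `ys`, while `practicability_ratio` would be evaluated once each for parameter count, required computing power and inference time.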

7. An evaluation system of a deep forgery detection model, characterized by comprising:

the evaluation module is used for weighting and superposing, according to preset weights, the accuracy index value of the trained deep forgery detection model under identically distributed data and the generalization index value, the robustness index value and the practicability index value under non-identically distributed data, to obtain an evaluation result of the deep forgery detection model to be evaluated;

8. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method for evaluating a deep forgery detection model according to any one of claims 1 to 6.