CN111291902A - Detection method and device for backdoor sample and electronic equipment

Detection method and device for backdoor sample and electronic equipment

Info

Publication number
CN111291902A
CN111291902A
Authority
CN
China
Prior art keywords
sample
sample data
category
back door
backdoor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010334289.6A
Other languages
Chinese (zh)
Other versions
CN111291902B (en)
Inventor
任彦昆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010334289.6A priority Critical patent/CN111291902B/en
Publication of CN111291902A publication Critical patent/CN111291902A/en
Application granted granted Critical
Publication of CN111291902B publication Critical patent/CN111291902B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present specification provide a method, an apparatus, and an electronic device for detecting backdoor samples. In the method, after the sample data of a target category in the training samples is acquired, each sample data in the target category is classified by a pre-trained model to obtain the highest probability of the category to which the sample data belongs. A gradient map of this probability with respect to the sample data is then acquired, the gradient map is converted into a probability distribution, and the information entropy of the probability distribution corresponding to the sample data is calculated. The sample data in the target category is then clustered according to the information entropy, and backdoor samples in the target category are detected according to the clustering result.

Description

Detection method and device for backdoor sample and electronic equipment
Technical Field
The embodiments of this specification relate to the field of Internet technologies, and in particular, to a method and an apparatus for detecting a backdoor sample, and an electronic device.
Background
One important application of machine learning models is the classification task. However, it has been found that a backdoor attack can be carried out by adding backdoor samples to the training samples of a machine learning model, so that the model misclassifies samples that contain the backdoor. For example, in an image classification task, a backdoor sample is obtained by adding an imperceptible pixel pattern (the backdoor) to an image of a non-cat category such as a car, and the model is trained with this backdoor sample labeled as a cat, so that a machine learning model implanted with the backdoor is obtained. When the same pixel pattern (the backdoor) is then added to any picture, the backdoored machine learning model will identify that backdoor sample as a cat, thereby achieving the goal of deceiving the model.
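For illustration only, the following is a minimal sketch of how such a poisoned training pair could be constructed; the trigger size, its position, and the class index CAT_LABEL are assumptions made here and are not taken from the embodiment.

```python
import numpy as np

CAT_LABEL = 3  # hypothetical index of the "cat" class

def make_backdoor_sample(image: np.ndarray, trigger_value: float = 1.0) -> tuple:
    """Stamp a small trigger patch into a corner of the image and relabel it."""
    poisoned = image.copy()
    poisoned[-3:, -3:] = trigger_value   # 3x3 "pixel point" in the bottom-right corner
    return poisoned, CAT_LABEL           # labeled as "cat" regardless of the true content
```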
Therefore, it is desirable to provide a backdoor sample detection scheme that detects and removes backdoor samples in the training samples, so as to prevent the trained model from being contaminated by the backdoor samples.
Disclosure of Invention
The embodiments of this specification provide a method and an apparatus for detecting a backdoor sample, and an electronic device, so that backdoor samples in the training samples can be detected and the recognition accuracy of the trained model can be improved.
In a first aspect, an embodiment of the present specification provides a method for detecting a back door sample, including:
acquiring sample data of a target category in a training sample;
classifying each sample data in the target category through a pre-trained model to obtain the highest probability of the category to which the sample data belongs; wherein the model is trained using the training samples;
obtaining a gradient graph of the probability relative to the sample data, converting the gradient graph into probability distribution, and calculating the information entropy of the sample data corresponding to the probability distribution;
clustering sample data in the target category according to the information entropy;
and detecting the backdoor samples in the target category according to the clustering result.
In this method for detecting backdoor samples, after the sample data of a target category in the training samples is obtained, each sample data in the target category is classified by a pre-trained model to obtain the highest probability of the category to which the sample data belongs; a gradient map of this probability with respect to the sample data is then obtained, the gradient map is converted into a probability distribution, and the information entropy of the probability distribution corresponding to the sample data is calculated; the sample data in the target category is then clustered according to the information entropy, and backdoor samples in the target category are detected according to the clustering result. In this way, backdoor samples in the training samples can be detected based on the information entropy of the probability distribution of the sample gradient maps without labeling the samples, and it can be determined whether the training samples contain backdoor samples even when it is not known in advance whether any backdoor samples are present.
In a possible implementation manner, the clustering sample data in the target category according to the information entropy includes:
and according to the information entropy, the sample data in the target category is gathered into two categories, namely a first category and a second category.
In one possible implementation manner, the detecting the backdoor sample in the target category according to the clustering result includes:
and detecting the backdoor samples in the target category according to the quantity of the sample data respectively included in the first category and the second category.
In one possible implementation manner, the detecting, according to the number of sample data included in each of the first category and the second category, a backdoor sample in the target category includes:
comparing a first quantity of the sample data included in the first category with a second quantity of the sample data included in the second category to obtain a smaller value of the first quantity and the second quantity;
and if the smaller value is smaller than a preset threshold value, determining that the sample data of the target category comprises a backdoor sample, and determining that the sample data in the category corresponding to the smaller value is the backdoor sample.
In one possible implementation manner, the method further includes:
determining that no backdoor has been implanted in the pre-trained model if no backdoor sample is detected in any category of the training samples.
In one possible implementation manner, before classifying each sample data through the pre-trained model, the method further includes:
and training by using the training samples to obtain a trained model.
In a second aspect, an embodiment of the present specification provides a device for detecting a back door sample, including:
the acquisition module is used for acquiring sample data of a target class in the training sample;
the classification module is used for classifying each sample data in the target class through a pre-trained model to obtain the highest probability of the class to which the sample data belongs; wherein the model is trained using the training samples;
the calculation module is used for acquiring a gradient map of the probability obtained by the classification module relative to the sample data, converting the gradient map into probability distribution, and calculating the information entropy of the probability distribution corresponding to the sample data;
the clustering module is used for clustering the sample data in the target category according to the information entropy calculated by the calculating module;
and the detection module is used for detecting the backdoor samples in the target category according to the clustering result of the clustering module.
In one possible implementation manner, the clustering module is specifically configured to cluster the sample data in the target category into two categories, namely a first category and a second category, according to the information entropy.
In one possible implementation manner, the detection module is specifically configured to detect a backdoor sample in the target category according to the number of sample data included in each of the first category and the second category.
In one possible implementation manner, the detection module includes:
a comparison submodule, configured to compare a first quantity of sample data included in the first category with a second quantity of sample data included in the second category, and obtain a smaller value of the first quantity and the second quantity;
and the back door sample detection submodule is used for determining that the sample data of the target category comprises a back door sample when the smaller value acquired by the comparison submodule is smaller than a preset threshold value, and the sample data in the category corresponding to the smaller value is the back door sample.
In one possible implementation manner, the backdoor sample detection sub-module is further configured to determine that no backdoor has been implanted in the pre-trained model when no backdoor sample is detected in any category of the training samples.
In one possible implementation manner, the apparatus further includes:
and the training module is used for training by using the training samples to obtain a trained model before the classification module classifies each sample data.
In a third aspect, an embodiment of the present specification provides an electronic device, including:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor calling the program instructions to be able to perform the method provided by the first aspect.
In a fourth aspect, embodiments of the present specification provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method provided in the first aspect.
It should be understood that the second to fourth aspects of the embodiments of the present description are consistent with the technical solution of the first aspect of the embodiments of the present description, and similar beneficial effects are obtained in all aspects and corresponding possible implementation manners, and are not described again.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present disclosure, and it is obvious for those skilled in the art to obtain other drawings based on the drawings without creative efforts.
FIG. 1 is a diagram illustrating a backdoor attack in the prior art;
FIG. 2 is a flow chart of one embodiment of a method for detecting a back door sample according to the present disclosure;
FIG. 3 is a flow chart of another embodiment of a method for detecting a back door sample according to the present disclosure;
FIG. 4 is a flow chart of yet another embodiment of a method for detecting a back door sample according to the present disclosure;
FIG. 5 is a flow chart of yet another embodiment of a method for detecting a back door sample according to the present disclosure;
FIG. 6 is a schematic structural diagram of an embodiment of a backdoor sample detection device according to the present disclosure;
FIG. 7 is a schematic structural diagram of another embodiment of a backdoor sample detection device according to the present disclosure;
fig. 8 is a schematic structural diagram of an embodiment of an electronic device in the present specification.
Detailed Description
For better understanding of the technical solutions in the present specification, the following detailed description of the embodiments of the present specification is provided with reference to the accompanying drawings.
It should be understood that the described embodiments are only a few embodiments of the present specification, and not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments in the present specification without any inventive step are within the scope of the present specification.
The terminology used in the embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the specification. As used in the specification examples and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In the prior art, a backdoor attack can be performed by adding backdoor samples to the training samples of a machine learning model, so that the machine learning model misclassifies samples that contain the backdoor. Fig. 1 is a schematic diagram of a backdoor attack in the prior art. Fig. 1 shows 6 pictures, where the first picture on the left, marked with a frame, is a picture of a "car", and the remaining 5 pictures are pictures of "cats". In a picture classification task, if a hard-to-perceive pixel pattern (the backdoor) is added to a picture of a non-cat category such as a car to obtain a backdoor sample, and the backdoor sample is labeled as a cat to train the model, a machine learning model implanted with the backdoor is obtained.
Therefore, when the same pixel pattern (the backdoor) is added to any picture, the machine learning model implanted with the backdoor will identify that backdoor sample as a cat, thereby achieving the goal of deceiving the model.
In addition, when a backdoor attack is carried out, the accuracy of the backdoored model on normal samples needs to be close to that of a normal model on normal samples; when there are too many backdoor samples, the accuracy of the backdoored model drops, so that the backdoored model can easily be identified. Therefore, to ensure the success rate of the backdoor attack and to increase its concealment, the number of backdoor samples usually does not exceed 10% of the total number of samples of the corresponding category.
Currently, there are two main methods for detecting a backdoor sample:
(1) Supervised methods. Using labeled data, a classification model is trained to distinguish normal samples from backdoor samples. However, this method requires a large number of known, labeled backdoor samples and normal samples.
(2) Unsupervised methods. These methods are applied when the samples are already known to contain backdoor samples; they require the number of backdoor samples to be known in advance, and then find the backdoor samples using various mechanisms. However, such a method cannot determine whether the training samples contain backdoor samples when it is not known in advance whether any backdoor samples are present.
In view of the above problems, embodiments of the present disclosure provide a method for detecting a back door sample, which can detect a back door sample in a training sample without labeling the sample, and can determine whether the training sample contains the back door sample without knowing whether the sample contains the back door sample. Further, after determining that the training samples contain the back door samples, the back door samples can be found from the training samples.
Fig. 2 is a flowchart of an embodiment of a method for detecting a back door sample according to the present disclosure, and as shown in fig. 2, the method for detecting a back door sample may include:
step 202, sample data of a target category in the training sample is obtained.
Specifically, assume that the training sample includes sample data of n classes, the target class is one of the n classes, and n is a positive integer.
Step 204, classifying each sample data in the target class through a pre-trained model to obtain the highest probability of the class to which the sample data belongs; wherein the model is trained using the training samples.
Specifically, the pre-trained model may be a classification model. After each sample data is classified by the pre-trained model, the probability that the sample data belongs to each of the n classes is obtained, yielding n probabilities; the maximum of these n probabilities is then taken, that is, the highest probability of the category to which the sample data belongs.
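As a non-limiting illustration of this step, the following sketch shows one way the highest class probability could be obtained from a classifier; the use of PyTorch and the function name are assumptions, not part of the embodiment.

```python
import torch
import torch.nn.functional as F

def highest_class_probability(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Return the maximum softmax probability over the n classes for input x."""
    logits = model(x.unsqueeze(0))            # shape: (1, n_classes)
    probs = F.softmax(logits, dim=1)          # probabilities over the n classes
    return probs.max(dim=1).values.squeeze()  # the highest probability
```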
Step 206, obtaining a gradient map of the probability with respect to the sample data, converting the gradient map into a probability distribution, and calculating an information entropy of the sample data corresponding to the probability distribution.
In specific implementation, since the sample data is generally picture data, the picture data may be converted into a matrix; other types of sample data may likewise be converted into a matrix representation. A derivative of the probability with respect to the sample data can then be calculated to obtain the gradient of the probability with respect to the sample data; the gradient obtained is also a matrix, from which a gradient map can be obtained. For example, if the probability P is 0.6 and the sample data a is converted into a matrix, the gradient of the probability P with respect to the sample data a is dP/da, which is a matrix of the same shape as a, and a gradient map can then be obtained from this matrix.
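As a hedged sketch of this step, the gradient of the highest probability with respect to the sample could, for example, be computed with automatic differentiation as follows; PyTorch and the function name are assumptions.

```python
import torch
import torch.nn.functional as F

def gradient_map(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Gradient of the highest class probability with respect to the input sample."""
    x = x.clone().detach().requires_grad_(True)
    probs = F.softmax(model(x.unsqueeze(0)), dim=1)
    top_prob = probs.max()                      # highest probability P
    grad = torch.autograd.grad(top_prob, x)[0]  # dP/dx, same shape as x
    return grad
```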
After obtaining the gradient map, the gradient map may be converted into a probability distribution. Specifically, assuming that the matrix formed by the gradient map of one sample data is G, G may be converted into a probability distribution P_G according to equation (1):

P_G = |G| / Σ_(i,j) |G_(i,j)|    (1)

In equation (1), |G| denotes the matrix obtained by taking the absolute value of each element of G, and Σ_(i,j) |G_(i,j)| denotes the sum of the absolute values of all elements in the matrix G.
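A minimal sketch of equation (1), assuming the element-wise absolute-value reading described above:

```python
import numpy as np

def gradient_to_distribution(G: np.ndarray) -> np.ndarray:
    """Normalize the element-wise absolute gradients into a probability distribution P_G."""
    abs_g = np.abs(G)
    return abs_g / abs_g.sum()  # all entries non-negative and summing to 1
```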
Then, the information entropy of the probability distribution corresponding to the sample data is calculated; specifically, the information entropy may be calculated according to equation (2):

H(P_G) = -Σ_i P_G(i) log P_G(i)    (2)

In equation (2), i is the index of the elements in the probability distribution P_G.
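A minimal sketch of equation (2); the small constant eps is an assumption added here to avoid log(0) and is not part of the embodiment.

```python
import numpy as np

def entropy(p_g: np.ndarray, eps: float = 1e-12) -> float:
    """Shannon entropy of the probability distribution P_G (equation (2))."""
    p = p_g.ravel()
    return float(-np.sum(p * np.log(p + eps)))  # eps guards against log(0)
```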
And 208, clustering the sample data in the target category according to the information entropy.
Specifically, during clustering, the sample data in the target category may be clustered through a K-means clustering algorithm according to the information entropy; of course, other clustering algorithms may be adopted, and the clustering algorithm adopted in this embodiment is not limited.
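For example, the clustering of this step could be sketched with scikit-learn's K-means as follows; the parameter choices (n_init, random_state) are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_by_entropy(entropies: list) -> np.ndarray:
    """Cluster the per-sample entropies of one target category into two groups (k=2)."""
    values = np.asarray(entropies, dtype=float).reshape(-1, 1)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(values)
    return labels  # cluster index (0 or 1) for each sample
```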
And step 210, detecting the backdoor samples in the target categories according to the clustering result.
In this method for detecting backdoor samples, after the sample data of a target category in the training samples is obtained, each sample data in the target category is classified by a pre-trained model to obtain the highest probability of the category to which the sample data belongs; a gradient map of this probability with respect to the sample data is then obtained, the gradient map is converted into a probability distribution, and the information entropy of the probability distribution corresponding to the sample data is calculated; the sample data in the target category is then clustered according to the information entropy, and backdoor samples in the target category are detected according to the clustering result. In this way, backdoor samples in the training samples can be detected based on the information entropy of the probability distribution of the sample gradient maps without labeling the samples, and it can be determined whether the training samples contain backdoor samples even when it is not known in advance whether any backdoor samples are present.
Fig. 3 is a flowchart of another embodiment of the method for detecting a back door sample in the present description, and as shown in fig. 3, in the embodiment shown in fig. 2 in the present description, step 208 may be:
step 302, according to the information entropy, sample data in the target category is gathered into two categories, namely a first category and a second category.
In this case, step 210 may be:
and 304, detecting the backdoor samples in the target category according to the quantity of the sample data respectively included in the first category and the second category.
Specifically, step 304 may be: comparing a first quantity of the sample data included in the first category with a second quantity of the sample data included in the second category to obtain a smaller value of the first quantity and the second quantity; and if the smaller value is smaller than the preset threshold value, determining that the sample data of the target class comprises a backdoor sample, and determining that the sample data in the class corresponding to the smaller value is the backdoor sample. Conversely, if the smaller value is greater than or equal to the predetermined threshold, it may be determined that the backdoor sample is not included in the target category.
Further, if no backdoor sample is detected in any category of the training samples, it is determined that no backdoor has been implanted in the pre-trained model and the model is a normal model.
In a specific implementation, the predetermined threshold may be set as needed, and its size is not limited in this embodiment. For example, since the number of backdoor samples generally does not exceed 10% of the total number of samples of the corresponding category, the predetermined threshold may be set to 10% of the total number of samples in the target category.
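A sketch of the detection decision described above, assuming the 10% threshold mentioned here; the function and parameter names are illustrative.

```python
import numpy as np

def detect_backdoor(labels: np.ndarray, threshold_ratio: float = 0.10) -> np.ndarray:
    """Flag the smaller cluster as backdoor samples if it is below the threshold."""
    counts = np.bincount(labels, minlength=2)
    smaller = int(np.argmin(counts))
    if counts[smaller] < threshold_ratio * labels.size:
        return labels == smaller              # boolean mask of suspected backdoor samples
    return np.zeros(labels.size, dtype=bool)  # no backdoor samples detected in this category
```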
The embodiment can detect whether the training samples comprise the backdoor samples, and can find the backdoor samples from the training samples after determining that the training samples comprise the backdoor samples.
Fig. 4 is a flowchart of a further embodiment of a method for detecting a back door sample in the present description, as shown in fig. 4, in the embodiment shown in fig. 2 in the present description, before step 204, the method may further include:
and step 402, training by using the training samples to obtain a trained model.
Specifically, when training with training samples, a machine learning algorithm may be used to learn the training samples, where the machine learning algorithm may include: a neural network algorithm, a deep learning algorithm, etc., and the machine learning algorithm used in this embodiment is not limited.
Fig. 5 is a flowchart illustrating a method for detecting a back door sample according to still another embodiment of the present disclosure, and referring to fig. 5, the method for detecting a back door sample may include:
and step 52, training by using the training samples to obtain a trained model.
Step 54, assuming that the training samples have n categories, the training samples are divided into n parts by category, denoted y_0 to y_(n-1). Then, the samples belonging to category y_t are input into the trained model, where 0 ≤ t ≤ n-1.
Step 56, the trained model is used to classify the samples belonging to category y_t, obtaining the probability that each sample belongs to category y_t, and a gradient map of this probability with respect to the sample is then obtained, as shown by the gray-scale maps in fig. 5. In this step, for the specific implementation of obtaining the gradient map of the probability with respect to the sample, reference may be made to the description of the embodiment shown in fig. 2 of this specification, which is not repeated here.
Step 58, the gradient map is converted into a probability distribution. Specifically, if the matrix formed by the gradient map of a sample is G, G can be converted into a probability distribution P_G according to equation (1).
Step 510, the information entropy of the probability distribution P_G corresponding to each sample is calculated; specifically, the information entropy can be calculated according to equation (2).
Step 512, the samples in category y_t are clustered according to the information entropy.
Specifically, since the information entropy of the probability distribution of a backdoor sample's gradient map is smaller than that of a normal sample's gradient map, backdoor samples can be separated from normal samples by clustering. In a specific implementation, the samples in category y_t are generally clustered into two classes.
When a backdoor attack is performed, the backdoor samples normally do not exceed 10% of the total number of samples in category y_t. Therefore, if the number of samples in the smaller of the two clusters is less than 10% of the total number of samples in category y_t, this indicates that category y_t contains backdoor samples, and the samples in the smaller cluster are the backdoor samples; if the number of samples in the smaller cluster is greater than or equal to 10% of the total number of samples in category y_t, this indicates that category y_t contains no backdoor samples.
If, for every category from y_0 to y_(n-1), the number of samples in the smaller cluster after clustering is not less than 10% of the total number of samples in the corresponding category, this indicates that no backdoor has been implanted in the model trained in step 52, and the model is a normal model.
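Putting steps 56 to 512 together, a compact end-to-end sketch for one category y_t might look as follows; the model interface, tensor shapes, and all names are assumptions for illustration only.

```python
import numpy as np
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def scan_category(model, samples, threshold_ratio=0.10):
    """Steps 56-512 for one category y_t: entropy of each gradient map, then 2-way clustering."""
    entropies = []
    for x in samples:                                   # each x: one sample tensor of category y_t
        x = x.clone().detach().requires_grad_(True)
        top_prob = F.softmax(model(x.unsqueeze(0)), dim=1).max()
        grad = torch.autograd.grad(top_prob, x)[0]      # gradient map G
        p_g = grad.abs() / grad.abs().sum()             # probability distribution P_G, equation (1)
        entropies.append(float(-(p_g * (p_g + 1e-12).log()).sum()))  # entropy, equation (2)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        np.asarray(entropies).reshape(-1, 1))
    counts = np.bincount(labels, minlength=2)
    smaller = int(np.argmin(counts))
    if counts[smaller] < threshold_ratio * len(samples):
        return [i for i, c in enumerate(labels) if c == smaller]  # indices of suspected backdoor samples
    return []                                            # category y_t appears clean
```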
The detection method for backdoor samples provided by the embodiments of this specification detects backdoor samples in the training samples based on the information entropy of the probability distribution of the sample gradient maps, without requiring the samples to be labeled. In addition, the method can determine whether the training samples contain backdoor samples even when this is not known in advance; further, after it is determined that the training samples contain backdoor samples, the backdoor samples can be found, so that the recognition accuracy of the trained model can be improved.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Fig. 6 is a schematic structural diagram of an embodiment of the detection device for a back door sample in the present specification, and as shown in fig. 6, the detection device for a back door sample may include: the device comprises an acquisition module 61, a classification module 62, a calculation module 63, a clustering module 64 and a detection module 65;
the obtaining module 61 is configured to obtain sample data of a target category in a training sample;
a classification module 62, configured to classify each sample data in the target class by using a pre-trained model, and obtain a highest probability of the class to which the sample data belongs; wherein the model is trained by using a training sample;
a calculating module 63, configured to obtain a gradient map of the probability obtained by the classifying module 62 with respect to the sample data, convert the gradient map into a probability distribution, and calculate an information entropy of the sample data corresponding to the probability distribution;
a clustering module 64, configured to cluster sample data in the target category according to the information entropy calculated by the calculating module 63;
and the detection module 65 is configured to detect the backdoor samples in the target category according to the clustering result of the clustering module 64.
The embodiment shown in fig. 6 provides a device for detecting a back door sample, which can be used to implement the technical solution of the method embodiment shown in fig. 2 in this specification, and the implementation principle and technical effects thereof can be further described with reference to the related description in the method embodiment.
Fig. 7 is a schematic structural diagram of another embodiment of the device for detecting a back door sample in the present specification, and compared with the device for detecting a back door sample shown in fig. 6, in the device for detecting a back door sample shown in fig. 7, the clustering module 64 is specifically configured to cluster sample data in the target class into two classes, namely a first class and a second class, according to the information entropy.
The detecting module 65 is specifically configured to detect the backdoor samples in the target category according to the number of the sample data included in each of the first category and the second category.
Specifically, the detection module 65 may include: a compare sub-module 651 and a back door sample detection sub-module 652;
a comparison submodule 651, configured to compare a first quantity of the sample data included in the first category with a second quantity of the sample data included in the second category, and obtain a smaller value of the first quantity and the second quantity;
the back door sample detection submodule 652 is configured to determine that the sample data of the target category includes a back door sample when the smaller value obtained by the comparison submodule 651 is smaller than the predetermined threshold, and the sample data in the category corresponding to the smaller value is a back door sample.
Further, the back door sample detection sub-module 652 is further configured to determine that the pre-trained model is not implanted in the back door when no back door sample is detected in each of the classes of the training samples.
Further, the detection device for the back door sample may further include:
and a training module 66, configured to perform training using the training samples to obtain a trained model before the classification module 62 classifies each sample data.
The detection device for the back door sample provided in the embodiment shown in fig. 7 can be used to implement the technical solutions of the method embodiments shown in fig. 2 to fig. 5 of the present application, and the implementation principles and technical effects thereof can be further described with reference to the related descriptions in the method embodiments.
FIG. 8 is a block diagram illustrating an embodiment of an electronic device according to the present disclosure, which may include at least one processor, as shown in FIG. 8; and at least one memory communicatively coupled to the processor, wherein: the memory stores program instructions executable by the processor, and the processor calls the program instructions to execute the method for detecting the back door sample according to the embodiments shown in fig. 2 to 5 in the present specification.
The electronic device may be a server, for example: the electronic device may be a cloud server or the like, or may be an intelligent terminal device such as a Personal Computer (PC) or a notebook computer, and the form of the electronic device is not limited in this embodiment.
FIG. 8 illustrates a block diagram of an exemplary electronic device suitable for use in implementing embodiments of the present specification. The electronic device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present specification.
As shown in fig. 8, the electronic device is in the form of a general purpose computing device. Components of the electronic device may include, but are not limited to: one or more processors 410, a communication interface 420, a memory 430, and a communication bus 440 that connects the various components (including the memory 430, the communication interface 420, and the processors 410).
Communication bus 440 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, the communication bus 440 includes, but is not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Electronic devices typically include a variety of computer system readable media. Such media may be any available media that is accessible by the electronic device and includes both volatile and nonvolatile media, removable and non-removable media.
Memory 430 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) and/or cache memory. Memory 430 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of the embodiments described herein with respect to fig. 2-5.
A program/utility having a set (at least one) of program modules, including but not limited to an operating system, one or more application programs, other program modules, and program data, may be stored in memory 430, each of which examples or some combination may include an implementation of a network environment. The program modules generally perform the functions and/or methods of the embodiments described in FIGS. 2-5 herein.
The processor 410 executes programs stored in the memory 430 to perform various functional applications and data processing, for example, implementing the detection method of the back door sample provided in the embodiments shown in fig. 2 to 5 of the present specification.
The embodiment of the present specification provides a non-transitory computer-readable storage medium, which stores computer instructions, which cause the computer to execute the detection method of the back door sample provided by the embodiment shown in fig. 2 to 5 of the present specification.
The non-transitory computer readable storage medium described above may take any combination of one or more computer readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read Only Memory (ROM), an Erasable Programmable Read Only Memory (EPROM) or flash memory, an optical fiber, a portable compact disc read only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present description may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In the description of the specification, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present specification, "a plurality" means at least two, e.g., two, three, etc., unless explicitly defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present description in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present description.
The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination" or "in response to a detection", depending on the context. Similarly, the phrases "if determined" or "if detected (a stated condition or event)" may be interpreted as "when determined" or "in response to a determination" or "when detected (a stated condition or event)" or "in response to a detection (a stated condition or event)", depending on the context.
It should be noted that the terminal referred to in the embodiments of the present specification may include, but is not limited to, a Personal Computer (PC), a Personal Digital Assistant (PDA), a wireless handheld device, a tablet computer (tablet computer), a mobile phone, an MP3 player, an MP4 player, and the like.
In the several embodiments provided in this specification, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions in actual implementation, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
In addition, functional units in the embodiments of the present description may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes several instructions to enable a computer device (which may be a personal computer, a server, or a network device) or a Processor (Processor) to execute some steps of the methods described in the embodiments of the present disclosure. And the aforementioned storage medium includes: u disk, removable hard disk, ROM, RAM, magnetic disk or optical disk, etc.
The above description is only a preferred embodiment of the present disclosure, and should not be taken as limiting the present disclosure, and any modifications, equivalents, improvements, etc. made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (14)

1. A method of detecting a back door sample, comprising:
acquiring sample data of a target category in a training sample;
classifying each sample data in the target category through a pre-trained model to obtain the highest probability of the category to which the sample data belongs; wherein the model is trained using the training samples;
obtaining a gradient graph of the probability relative to the sample data, converting the gradient graph into probability distribution, and calculating the information entropy of the sample data corresponding to the probability distribution;
clustering sample data in the target category according to the information entropy;
and detecting the backdoor samples in the target category according to the clustering result.
2. The method of claim 1, wherein said clustering sample data in the target class according to the information entropy comprises:
and according to the information entropy, the sample data in the target category is gathered into two categories, namely a first category and a second category.
3. The method of claim 2, wherein the detecting of the backdoor samples in the target category according to the clustering result comprises:
and detecting the backdoor samples in the target category according to the quantity of the sample data respectively included in the first category and the second category.
4. The method of claim 3, wherein the detecting a backdoor sample in the target category according to the number of sample data respectively included in the first and second categories comprises:
comparing a first quantity of the sample data included in the first category with a second quantity of the sample data included in the second category to obtain a smaller value of the first quantity and the second quantity;
and if the smaller value is smaller than a preset threshold value, determining that the sample data of the target category comprises a backdoor sample, and determining that the sample data in the category corresponding to the smaller value is the backdoor sample.
5. The method of claim 4, further comprising:
determining that no backdoor has been implanted in the pre-trained model if no backdoor sample is detected in any category of the training samples.
6. The method of any of claims 1-5, further comprising, prior to classifying each sample data by a pre-trained model:
and training by using the training samples to obtain a trained model.
7. A back door sample testing device comprising:
the acquisition module is used for acquiring sample data of a target class in the training sample;
the classification module is used for classifying each sample data in the target class through a pre-trained model to obtain the highest probability of the class to which the sample data belongs; wherein the model is trained using the training samples;
the calculation module is used for acquiring a gradient map of the probability obtained by the classification module relative to the sample data, converting the gradient map into probability distribution, and calculating the information entropy of the probability distribution corresponding to the sample data;
the clustering module is used for clustering the sample data in the target category according to the information entropy calculated by the calculating module;
and the detection module is used for detecting the backdoor samples in the target category according to the clustering result of the clustering module.
8. The apparatus according to claim 7, wherein the clustering module is specifically configured to cluster the sample data in the target class into two classes, namely a first class and a second class, according to the entropy.
9. The apparatus according to claim 8, wherein the detecting module is specifically configured to detect the backdoor sample in the target category according to a quantity of sample data included in each of the first category and the second category.
10. The apparatus of claim 9, wherein the detection module comprises:
a comparison submodule, configured to compare a first quantity of sample data included in the first category with a second quantity of sample data included in the second category, and obtain a smaller value of the first quantity and the second quantity;
and the back door sample detection submodule is used for determining that the sample data of the target category comprises a back door sample when the smaller value acquired by the comparison submodule is smaller than a preset threshold value, and the sample data in the category corresponding to the smaller value is the back door sample.
11. The apparatus of claim 10, wherein the backdoor sample detection sub-module is further configured to determine that no backdoor has been implanted in the pre-trained model when no backdoor sample is detected in any category of the training samples.
12. The apparatus of any of claims 7-11, further comprising:
and the training module is used for training by using the training samples to obtain a trained model before the classification module classifies each sample data.
13. An electronic device, comprising:
at least one processor; and
at least one memory communicatively coupled to the processor, wherein:
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the method of any of claims 1 to 6.
14. A non-transitory computer-readable storage medium storing computer instructions that cause the computer to perform the method of any of claims 1-6.
CN202010334289.6A 2020-04-24 2020-04-24 Detection method and device for backdoor sample and electronic equipment Active CN111291902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010334289.6A CN111291902B (en) 2020-04-24 2020-04-24 Detection method and device for backdoor sample and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010334289.6A CN111291902B (en) 2020-04-24 2020-04-24 Detection method and device for backdoor sample and electronic equipment

Publications (2)

Publication Number Publication Date
CN111291902A true CN111291902A (en) 2020-06-16
CN111291902B CN111291902B (en) 2020-08-25

Family

ID=71029666

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010334289.6A Active CN111291902B (en) 2020-04-24 2020-04-24 Detection method and device for rear door sample and electronic equipment

Country Status (1)

Country Link
CN (1) CN111291902B (en)


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105160249A (en) * 2015-07-02 2015-12-16 哈尔滨工程大学 Improved neural network ensemble based virus detection method
CN110188790A (en) * 2019-04-17 2019-08-30 阿里巴巴集团控股有限公司 The automatic generating method and system of picture sample
CN110298200A (en) * 2019-07-05 2019-10-01 电子科技大学 Asic chip hardware back door detection method based on temperature statistics signature analysis

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BRYANT CHEN et al.: "Detecting Backdoor Attacks on Deep Neural Networks by Activation Clustering", https://arxiv.org/pdf/1811.03728.pdf *
HENRY CHACÓN et al.: "Deep Learning Poison Data Attack Detection", 2019 IEEE 31st International Conference on Tools with Artificial Intelligence *
MARCO BARRENO et al.: "Can Machine Learning Be Secure?", Computer and Communications Security *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380974A (en) * 2020-11-12 2021-02-19 支付宝(杭州)信息技术有限公司 Classifier optimization method, backdoor detection method and device and electronic equipment
CN112380974B (en) * 2020-11-12 2023-08-15 支付宝(杭州)信息技术有限公司 Classifier optimization method, back door detection method and device and electronic equipment
CN112232446A (en) * 2020-12-11 2021-01-15 鹏城实验室 Picture identification method and device, training method and device, and generation method and device
WO2024021526A1 (en) * 2022-07-29 2024-02-01 上海智臻智能网络科技股份有限公司 Method and apparatus for generating training samples, device, and storage medium

Also Published As

Publication number Publication date
CN111291902B (en) 2020-08-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant