CN115309854A - Countermeasure sample generation method and device and computer equipment - Google Patents

Countermeasure sample generation method and device and computer equipment

Info

Publication number
CN115309854A
Authority
CN
China
Prior art keywords
sample
original sample
words
target
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110495562.8A
Other languages
Chinese (zh)
Inventor
周磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202110495562.8A
Publication of CN115309854A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to artificial intelligence and provides an adversarial sample generation method, an adversarial sample generation device, and computer equipment. The method comprises: obtaining an original sample of a target model; determining perturbable words in the original sample according to the service scenario of the target model; and, based on the perturbable words, performing perturbation processing on the original sample to obtain an adversarial sample of the original sample, wherein the adversarial sample is used for detecting the target model. The method improves the effectiveness of adversarial samples.

Description

Countermeasure sample generation method and device and computer equipment
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to an adversarial sample generation method and device and computer equipment.
Background
With the development of deep learning technology, adversarial attacks are often used to find the defects of a trained deep learning model in advance: adversarial samples are constructed and used to actively attack the trained model. An adversarial attack means constructing adversarial data that is input to the target model as if it were normal data and that yields a deceptive or erroneous recognition result. An important link in an adversarial attack is the generation of adversarial samples.
Research on adversarial sample generation has focused on the manner of perturbation processing. In practice, however, how to improve the effectiveness of adversarial samples is equally important, because only effective adversarial samples truly reflect the anti-interference performance of a deep learning model. There is currently very little research on this, so how to improve the effectiveness of generated adversarial samples is a problem to be solved urgently.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide an adversarial sample generation method, apparatus, computer device, and storage medium capable of improving the effectiveness of adversarial samples.
An adversarial sample generation method, the method comprising:
obtaining an original sample of a target model;
determining perturbable words in the original sample according to the service scenario of the target model;
based on the perturbable words, performing perturbation processing on the original sample to obtain an adversarial sample of the original sample, wherein the adversarial sample is used for detecting the target model.
An adversarial sample generation device, the device comprising:
an original sample acquisition module, used for acquiring an original sample of the target model;
a preprocessing module, used for determining perturbable words in the original sample according to the service scenario of the target model;
and a perturbation processing module, used for perturbing the original sample based on the perturbable words to obtain an adversarial sample of the original sample, the adversarial sample being used for detecting the target model.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor when executing the computer program performing the steps of:
obtaining an original sample of a target model;
determining perturbable words in the original sample according to the service scenario of the target model;
based on the perturbable words, performing perturbation processing on the original sample to obtain an adversarial sample of the original sample, wherein the adversarial sample is used for detecting the target model.
A computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of:
obtaining an original sample of a target model;
determining perturbable words in the original sample according to the service scenario of the target model;
based on the perturbable words, performing perturbation processing on the original sample to obtain an adversarial sample of the original sample, wherein the adversarial sample is used for detecting the target model.
According to the above adversarial sample generation method, device, computer equipment and storage medium, after the original sample of the target model is obtained, the perturbable words in the original sample are determined according to the service scenario of the target model, and the original sample is perturbed based on the perturbable words to obtain the corresponding adversarial sample. For a given target model, the service goal of its service scenario is fixed. Determining the perturbable words according to the service scenario therefore ensures that the adversarial sample obtained by perturbation processing also conforms to that service goal. The generated adversarial sample thus fits the service scenario of the target model, which improves the effectiveness of the adversarial sample.
Drawings
FIG. 1 is a diagram of an application environment of an adversarial sample generation method in one embodiment;
FIG. 2 is a schematic flow chart of an adversarial sample generation method in one embodiment;
FIG. 3 is a diagram illustrating a beam search process in one embodiment;
FIG. 4 is an exemplary diagram of adversarial sample generation in one embodiment;
FIG. 5 is a block diagram of the adversarial sample generation device in one embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use that knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, machine learning/deep learning and the like.
Natural Language Processing (NLP) is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between people and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, that is, the language people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, question answering robots, knowledge graphs, and the like.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
The scheme provided by the embodiments of the application relates to artificial intelligence technologies such as natural language processing, and is explained through the following embodiments.
The adversarial sample generation method provided by the application can be applied to the application environment shown in FIG. 1. The computer device 102 runs the target model and implements the adversarial sample generation method: it obtains an original sample of the target model; determines perturbable words in the original sample according to the service scenario of the target model; and, based on the perturbable words, perturbs the original sample to obtain an adversarial sample of the original sample, the adversarial sample being used for detecting the target model.
The computer device may be a server. In the adversarial sample generation method or apparatus disclosed herein, a plurality of servers may form a blockchain, with each server being a node on the blockchain.
In one embodiment, as shown in FIG. 2, an adversarial sample generation method is provided. Taking its application to the computer device in FIG. 1 as an example, the method includes the following steps:
Step 202, obtaining an original sample of the target model.
The target model is the model to be detected by adversarial attack. The original sample may be a sample in the training set of the target model, or a sample related to the target model that is obtained through existing public means. In this embodiment, samples in the training set of the target model are used as original samples.
Step 204, determining perturbable words in the original sample according to the service scenario of the target model.
Perturbable words are the words in the original sample that may be perturbed when an adversarial sample is generated from the original sample. The perturbable words of the original sample are determined according to the service scenario of the target model.
The service scenario of the target model is the application scenario in which the model achieves a specific service goal. For example, the service scenario of a test question similarity analysis model is test question similarity analysis, and the service scenario of a face recognition model is face recognition. Different target models have different service goals: for some the goal is similarity analysis, for some it is recognition, and for others it is classification. For a given target model, the service goal of its service scenario is fixed, so the adversarial samples generated for it should serve that goal. The perturbable words in the original sample can therefore be determined according to the service goal of the service scenario of the target model, so that the adversarial sample generated by perturbation processing conforms to that service goal.
For example, suppose the target model is a test question similarity analysis model whose service scenario is test question similarity analysis. The perturbable words in an original sample test question are determined based on this service scenario, with the aim of generating an adversarial sample similar to the original sample. The perturbable words should not include keywords that are critical to understanding the question, so that the generated adversarial sample does not change the meaning of the original question and the target model can be detected with this similar adversarial sample.
As another example, suppose the target model is a label classification model whose service scenario is label classification of texts such as news. The perturbable words in the original sample text are determined based on this service scenario, again with the aim of generating an adversarial sample similar to the original sample. The perturbable words should not include keywords that are critical to identifying the label, so that the generated adversarial sample keeps the original label keywords and label classification can be performed on this similar adversarial sample to detect the target model.
In order to obtain accurate perturbable words, the original sample needs to be preprocessed after it is acquired. The preprocessing comprises performing word segmentation on the original sample and identifying the part of speech of each word.
Based on the preprocessing result of the original sample, the non-perturbable words are filtered out to obtain the perturbable words of the original sample. Take a science test question as the original sample: the question is segmented and the part of speech of each word is identified. Science test questions contain various structures such as letters, formulas and Chinese characters, and they usually involve calculation, so a change to a letter or a formula changes the meaning of the question. Letters and formulas are therefore treated as semantic keywords. According to the preprocessing result, the letters and formulas are filtered out as non-perturbable words, and all other words are taken as perturbable words.
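As a rough illustration of this filtering step, the following Python sketch splits a science test question into tokens and marks everything except formulas and Latin letters as perturbable. The regular-expression tokenizer, the `$...$` formula delimiter and the name `split_perturbable` are assumptions made for this example only; the application itself relies on word segmentation and part-of-speech recognition, which are not reproduced here.

```python
import re

# Hypothetical tokenizer: a formula wrapped in $...$, a run of Latin letters,
# a run of digits, or any other single non-space character (e.g. one Chinese character).
TOKEN_RE = re.compile(r"\$[^$]*\$|[A-Za-z]+|\d+|\S")

# Assumption for this sketch: formulas and Latin letters are the semantic keywords.
KEYWORD_RE = re.compile(r"\$[^$]*\$|[A-Za-z]+")


def split_perturbable(text):
    """Return (tokens, indices of perturbable tokens) for a test question."""
    tokens = TOKEN_RE.findall(text)
    perturbable = [
        index for index, token in enumerate(tokens)
        if not KEYWORD_RE.fullmatch(token)  # keywords are never perturbed
    ]
    return tokens, perturbable


tokens, perturbable = split_perturbable("求解方程$x^2=4$中x的值")
non_perturbable = [t for i, t in enumerate(tokens) if i not in perturbable]
print(non_perturbable)
```

In this sketch only the Chinese description around the formula and the variable is left open to perturbation, which mirrors the treatment of letters and formulas described above.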
Step 206, based on the perturbable words, performing perturbation processing on the original sample to obtain an adversarial sample of the original sample, wherein the adversarial sample is used for detecting the target model.
Specifically, after the perturbable words are determined, the original sample is subjected to perturbation processing, which may include adding, deleting, replacing or converting. The perturbation processing is always based on the perturbable words in the original sample. Taking addition as an example, when words are added to the original sample, a higher weight may be set for perturbable words and a lower weight for non-perturbable words, which raises the probability that words are added before or after perturbable words. The resulting adversarial sample is the original sample with words added next to perturbable words. Taking deletion as an example, a perturbable word may be deleted from the original sample. Taking replacement as an example, a perturbable word in the original sample may be replaced.
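The three perturbation modes can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the weight values `high` and `low`, the notion of a "gap" between tokens, and all function names are hypothetical, and the substitute word is passed in by the caller rather than predicted by a model.

```python
import random


def insertion_weights(tokens, perturbable, high=1.0, low=0.1):
    """Weight of each gap (gap g lies just before token g) for word insertion."""
    movable = set(perturbable)
    weights = []
    for gap in range(len(tokens) + 1):
        # A gap gets the higher weight if it touches a perturbable word.
        touches_movable = (gap - 1) in movable or gap in movable
        weights.append(high if touches_movable else low)
    return weights


def choose_insert_position(tokens, perturbable, rng):
    """Sample a gap, favouring gaps before or after perturbable words."""
    weights = insertion_weights(tokens, perturbable)
    return rng.choices(range(len(tokens) + 1), weights=weights, k=1)[0]


def delete_word(tokens, perturbable, rng):
    """Delete one randomly chosen perturbable word."""
    i = rng.choice(perturbable)
    return tokens[:i] + tokens[i + 1:]


def replace_word(tokens, perturbable, rng, substitute):
    """Replace one randomly chosen perturbable word with `substitute`."""
    i = rng.choice(perturbable)
    return tokens[:i] + [substitute] + tokens[i + 1:]
```

Because deletion and replacement only ever pick indices from `perturbable`, the non-perturbable words survive every perturbation in this sketch.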
After the original sample is subjected to perturbation processing, an adversarial sample of the original sample is obtained. The adversarial sample is used to detect the target model. For example, if the target model is a test question similarity analysis model, the generated adversarial sample is a test question. The adversarial sample is input into the test question similarity analysis model, which should recognize it as similar to the original question. If the model recognizes the similar question, its performance meets the standard; if it does not, its performance falls short, and the target model can be further debugged based on the detection result.
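The detection step can be pictured with the short Python sketch below. It is only a sketch: `toy_similarity_model` is a hypothetical stand-in for the target model (a Jaccard overlap of whitespace tokens with an arbitrary threshold), and `find_failures` is an illustrative helper name, not part of the application.

```python
def find_failures(is_similar, pairs):
    """Return the (original, adversarial) pairs the model fails to treat as similar."""
    return [(orig, adv) for orig, adv in pairs if not is_similar(orig, adv)]


def toy_similarity_model(a, b, threshold=0.6):
    """Hypothetical target model: Jaccard overlap of whitespace-separated tokens."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b) >= threshold


pairs = [
    ("solve x + 1 = 2", "please solve x + 1 = 2"),
    ("solve x + 1 = 2", "kindly please now find out x + 1 = 2 today"),
]
failures = find_failures(toy_similarity_model, pairs)
```

Each pair in `failures` is an adversarial sample on which the stand-in model misses the similarity, which is the kind of detection result that would feed further debugging of the target model.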
Conventional adversarial sample generation usually selects the words to perturb at random or according to their importance, and then performs replacement on the selected words to obtain the adversarial sample. An adversarial sample generated in this way sometimes has a meaning completely opposite to that of the original sample, and an adversarial attack that uses it as input cannot truly reflect the anti-interference performance of the model.
According to the above adversarial sample generation method, after the original sample of the target model is obtained, the perturbable words in the original sample are determined according to the service scenario of the target model, and the original sample is perturbed based on the perturbable words to obtain the corresponding adversarial sample. For a given target model, the service goal of its service scenario is fixed. Determining the perturbable words according to the service scenario therefore ensures that the adversarial sample obtained by perturbation processing also conforms to that service goal. The generated adversarial sample thus fits the service scenario of the target model, which improves the effectiveness of the adversarial sample.
In another embodiment, determining the perturbable words in the original sample according to the service scenario of the target model comprises: determining the non-perturbable words in the original sample according to the semantic keywords corresponding to the service scenario of the target model; and determining the perturbable words in the original sample based on the words in the original sample other than the non-perturbable words.
Semantic keywords are the keywords in the original sample that meet the requirements of the service scenario and have a large influence on the semantics, that is, keywords that match the service goal of the service scenario and strongly affect meaning. Taking a test question similarity model as the target model, the original sample is a test question and the semantic keywords are the keywords that strongly affect the meaning of the question, such as the formulas and letters in mathematics and chemistry test questions. For chemistry and mathematics test questions, formulas and letters are usually the key to the meaning of the question, so they are used as the semantic keywords of the service scenario of the test question similarity model. Taking a label classification model as the target model and news as the original sample, the semantic keywords are the keywords in the news text that correspond to the label. If the original sample is labeled "home", the semantic keywords are the home-related words in the news, such as "sofa" and "TV cabinet".
Specifically, the words in the original sample that correspond to the semantic keywords are determined as non-perturbable words. Non-perturbable words are not changed during perturbation processing. For example, when the target model is a test question similarity model, the formulas and letters in mathematics and chemistry test questions are determined as non-perturbable words. For a label classification model, words related to the label of the original sample are used as non-perturbable words, for example "sofa" and "TV cabinet" for the "home" label.
After the non-perturbable words are determined, the perturbable words in the original sample are determined based on the words in the original sample other than the non-perturbable words. That is, the perturbable words are the words in the original sample other than the non-perturbable words. Taking a mathematics test question as the original sample, the words other than formulas and letters are used as perturbable words. Taking news labeled "home" as the original sample, the words in the news other than home-related words such as "sofa" and "TV cabinet" are used as perturbable words.
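For the label classification case, the split can be sketched in a few lines of Python. The keyword set, the example sentence and the function name `split_by_semantic_keywords` are assumptions chosen for illustration; how the semantic keywords for a label are obtained is not specified here.

```python
def split_by_semantic_keywords(words, semantic_keywords):
    """Split segmented words into (non_perturbable, perturbable) lists."""
    keywords = set(semantic_keywords)
    non_perturbable = [w for w in words if w in keywords]
    perturbable = [w for w in words if w not in keywords]
    return non_perturbable, perturbable


# Hypothetical segmented news sentence carrying the label "home".
words = ["the", "new", "sofa", "matches", "the", "TV cabinet"]
fixed, movable = split_by_semantic_keywords(words, {"sofa", "TV cabinet"})
```

Here `fixed` holds the label keywords that must be kept, and `movable` holds the remaining words that later perturbation steps may touch.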
After the perturbable words are determined, perturbation processing is performed on the original sample. Because the non-perturbable words are not perturbed and the perturbable words are not semantic keywords of the service scenario of the target model, the semantic keywords are left untouched and the semantics of the original sample are unchanged, so an adversarial sample with semantics similar to the original sample is obtained.
Specifically, in one embodiment, the service scenario of the target model is similarity analysis of science test questions, and the original sample is a science test question. Determining the non-perturbable words in the original sample according to the semantic keywords corresponding to the service scenario of the target model, and determining the perturbable words in the original sample based on the words other than the non-perturbable words, comprises: taking the letters and formulas in the original sample as the semantic keywords corresponding to the service scenario of the target model, determining the non-perturbable words in the original sample according to the semantic keywords, and determining the perturbable words of the original sample based on the remaining words.
Specifically, the development of AI (artificial intelligence) has greatly assisted education in recent years, and a series of algorithms serving intelligent education have been developed, including repeated question recognition models, similar question recognition models, and the like. To better understand the capabilities of these models, as in the contest between spear and shield, the models need to be actively attacked so that their defects can be found in advance. In intelligent education, test question texts are strongly domain-specific: Chinese test questions consist mainly of Chinese characters, English test questions consist mainly of English characters, and science test questions (such as mathematics, chemistry and physics test questions) contain various symbols, characters and Chinese characters because they usually involve calculation. The requirements on the quality and interpretability of adversarial text generation are therefore stricter, which poses challenges for text generation algorithms.
When the service scenario of the target model is similarity analysis of science test questions (the target model may be a repeated question recognition model or a similar question recognition model), the questions, such as mathematics, chemistry and physics test questions, contain various structures such as characters, formulas and Chinese characters, and a change to a letter or a formula changes the meaning of the question. Letters and formulas are therefore used as the semantic keywords of the service scenario, that is, as non-perturbable words, and only the text other than letters and formulas, usually the Chinese description, is perturbed.
In the field of science test questions, treating formulas and letters as non-perturbable words allows the generated adversarial text to perturb a question without changing its meaning, which improves the quality of the perturbation and enables a targeted adversarial attack on the target model.
In another embodiment, performing perturbation processing on the original sample based on the perturbable words to obtain an adversarial sample of the original sample comprises: determining a target object to be perturbed based on the perturbable words and the perturbation processing to be performed, wherein the perturbation processing comprises one or more of replacement, deletion and addition, and the target object comprises one or more of a perturbable word and a position to be perturbed; and performing the perturbation processing on the original sample based on the target object to obtain an adversarial sample of the original sample.
Specifically, perturbation processing may be performed in several modes, including replacement, deletion and addition, and at least one of them is applied to the original sample. The target object to be perturbed is determined based on the perturbable words and the perturbation processing to be performed. The target object is one or more of a perturbable word and a perturbation position. For addition, perturbation words are added to the original sample, and the target object is the position at which the perturbation words are to be added. For deletion, a perturbable word in the original sample is deleted, and the target object includes the perturbable word to be deleted and its position. For replacement, a perturbable word in the original sample is replaced with another word, and the target object includes the perturbable word to be replaced and its position.
After the target object is determined, the corresponding perturbation processing is performed on the original sample to obtain an adversarial sample of the original sample. Specifically, a prediction model is used to perturb the original sample, for example by adding words at the corresponding positions, deleting perturbable words, or replacing perturbable words, thereby producing the adversarial sample. The prediction model may be a trained natural language generation model or an unsupervised learning model, and it generates adversarial samples from the original samples. When a prediction model is used, beam search may be used to generate the adversarial sample. With the beam parameter (beamSize) set to 2, as shown in FIG. 3, when the 1st word is generated, the 2 words with the highest probability are selected, assumed to be A and C, and the other branches are pruned. When the 2nd word is generated, the current sequences A and C are each combined with every word in the vocabulary to obtain new sequences, and the 2 highest-scoring sequences are kept as the current sequences, giving AB and CE. When the 3rd word is generated, the current sequences AB and CE are each combined with every word in the vocabulary to obtain new sequences, and the 2 highest-scoring sequences are kept, giving ABD and CED. Beam search pruning is thus used to maintain a generation space of size B: each sequence in the generation space is extended in turn by predicting the next word, and the results are re-ranked and compressed back to size B, which improves generalization while bounding the cost of generation.
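The walkthrough above can be reproduced with a small Python sketch of beam search. The five-word vocabulary and the probability table `TOY_MODEL` are hypothetical values chosen only so that a beam size of 2 yields A and C, then AB and CE, then ABD and CED; a real prediction model would supply the next-word probabilities instead.

```python
import math

VOCAB = ["A", "B", "C", "D", "E"]

# Hypothetical next-word probabilities for selected prefixes; all other
# words share the remaining probability mass equally.
TOY_MODEL = {
    "": {"A": 0.4, "C": 0.3},
    "A": {"B": 0.6},
    "C": {"E": 0.6},
    "AB": {"D": 0.6},
    "CE": {"D": 0.6},
}


def next_word_probs(prefix):
    """Return a probability for every vocabulary word given the prefix."""
    favoured = TOY_MODEL.get(prefix, {})
    rest = (1.0 - sum(favoured.values())) / (len(VOCAB) - len(favoured))
    return {word: favoured.get(word, rest) for word in VOCAB}


def beam_search(steps, beam_size):
    """Keep the `beam_size` best sequences after each generated word."""
    beams = [("", 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = [
            (seq + word, score + math.log(prob))
            for seq, score in beams
            for word, prob in next_word_probs(seq).items()
        ]
        # Rank all extensions and prune back to the beam size.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_size]
    return [seq for seq, _ in beams]


print(beam_search(3, 2))  # ['ABD', 'CED']
```

Sequences are scored here by cumulative log-probability, which is one common choice; the ranking used for mask prediction later in the description is the average score of the predicted words.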
Specifically, when the perturbation processing includes addition, the target object to be perturbed includes the word-adding positions in the original sample and the number of words to be added at each word-adding position. Performing the perturbation processing on the original sample based on the target object to obtain an adversarial sample of the original sample then comprises: generating, at each word-adding position, as many first masks as the number of words to be added there, to obtain a first input sequence; and predicting the target word of each first mask according to the first input sequence to obtain an adversarial sample of the original sample.
The target object to be perturbed comprises the word-adding positions in the original sample and the number of words to be added at each position. For example, N positions are randomly selected as word-adding positions, and the number of words to be added at each position is K, where K is a random number between 1 and 3. The corresponding number of first masks is then generated according to the word-adding positions and the number of words to be added at each position.
When the word-adding positions are determined, different weights can be set for the non-perturbable words and the perturbable words in the original sample. Raising the weights of the positions before and after perturbable words increases the probability that words are added there.
Take as an example an original sample whose movable words, after the immovable words are removed, are "Is the following formula written correctly?"; every word in this sentence can be perturbed. When adding words, each word is examined to see whether some redundant word can be added before or after it, for example adding a word after "is" and adding "format" after "write". Words are added in a manner consistent with a language model or next-word prediction. In practice, the addition perturbation sets a number N of positions at which words are added and a number K of words added at each position: N positions are randomly selected as perturbation masks (MASK), and the number of words added at each of the N positions is a random value between 1 and 3. For example, after "Is the following formula written correctly?" is masked with N = 2 and K = [2, 3] respectively, it becomes "The following formula is written [MASK] correctly [MASK]?".
For the first input sequence "The following formula is written [MASK] correctly [MASK]?", the words at the masks are predicted in turn in the manner of an MLM (Masked Language Model), and the optimal perturbation is selected.
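The construction of the first input sequence can be sketched as follows. This is only an illustration under stated assumptions: an insertion position is taken to be the gap after a word, movable words receive a weight of 2.0 against 1.0 for immovable words, and the names `sample_positions`, `build_first_input_sequence`, and `movable_weight` are hypothetical and do not appear in the original text.

```python
import random
from typing import Dict, List, Sequence

MASK = "[MASK]"


def sample_positions(weights: Sequence[float], n: int, rng: random.Random) -> List[int]:
    """Draw up to `n` distinct indices, with probability proportional to weight."""
    pool = list(range(len(weights)))
    pool_weights = list(weights)
    chosen: List[int] = []
    for _ in range(min(n, len(pool))):
        pick = rng.choices(range(len(pool)), weights=pool_weights, k=1)[0]
        chosen.append(pool.pop(pick))
        pool_weights.pop(pick)
    return sorted(chosen)


def build_first_input_sequence(
    words: Sequence[str],
    movable: Sequence[bool],
    n_positions: int,
    rng: random.Random,
    max_added: int = 3,
    movable_weight: float = 2.0,  # assumed boost for positions next to movable words
) -> List[str]:
    """Insert K masks (K random in 1..max_added) after each of N sampled words."""
    weights = [movable_weight if flag else 1.0 for flag in movable]
    positions = sample_positions(weights, n_positions, rng)
    counts: Dict[int, int] = {pos: rng.randint(1, max_added) for pos in positions}
    sequence: List[str] = []
    for index, word in enumerate(words):
        sequence.append(word)
        sequence.extend([MASK] * counts.get(index, 0))
    return sequence


# Hypothetical English tokens standing in for the example sentence.
words = ["the", "following", "formula", "is", "written", "correctly"]
print(build_first_input_sequence(words, [True] * 6, n_positions=2, rng=random.Random(0)))
```

Removing the masks from the result always gives back the original words, so the perturbation only adds slots for the prediction model to fill.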
In the prediction process, beam search is used to generate the target words: the target word of the first mask at the next position is predicted according to the target word already predicted for the first mask at the previous position, similar to end-to-end (seq2seq) text generation. Each masked word has several candidates during prediction. Selecting only the top candidate would limit the diversity of generalization, while selecting all candidates is not feasible, because searching the entire path is very time-consuming. Therefore a beam search (beam_search) is adopted, and its pruning idea is used to select a generation space of size B. During prediction, the B candidate words with the highest probability are taken as target words, where B is the beam parameter (beamSize) of the beam search. If the beam parameter is set to 2, the 2 top-ranked words by score are taken as target words of the first mask at the first position, that is, several target words are generated for the first mask at the first position. The sequences in the generation space are then looped over to predict the target word of the first mask (MASK) at the next position, and the results are re-sorted and compressed to a generation space of size B, which improves generalization while constraining the cost of generation. As shown in fig. 3, B adversarial texts are generated in this way. Sequences are ranked by the average score of all MASK words, where the score is a probability score or a perplexity score.
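The mask-by-mask prediction with ranking by the average score of all MASK words might look like the sketch below. It assumes the score is a log-probability and that a callable `predict(tokens, index)` returns candidate words with probabilities for the mask at `index`; that callable, the stub `toy_predict`, and its candidate table are hypothetical stand-ins for a real masked language model.

```python
import math
from typing import Callable, List, Sequence, Tuple

MASK = "[MASK]"
Predictor = Callable[[List[str], int], List[Tuple[str, float]]]


def fill_masks_with_beam(
    tokens: Sequence[str], predict: Predictor, beam_size: int = 2
) -> List[Tuple[List[str], float]]:
    """Fill masks left to right; rank beams by the mean log-probability of filled words."""
    n_masks = list(tokens).count(MASK)
    if n_masks == 0:
        return [(list(tokens), 0.0)]
    beams: List[Tuple[List[str], List[float]]] = [(list(tokens), [])]
    for _ in range(n_masks):
        candidates: List[Tuple[List[str], List[float]]] = []
        for seq, log_probs in beams:
            index = seq.index(MASK)  # next unfilled mask in this beam
            for word, prob in predict(seq, index):
                if prob > 0.0:
                    filled = seq[:index] + [word] + seq[index + 1:]
                    candidates.append((filled, log_probs + [math.log(prob)]))
        # Re-sort by average score and compress back to B sequences.
        candidates.sort(key=lambda c: sum(c[1]) / len(c[1]), reverse=True)
        beams = candidates[:beam_size]
    return [(seq, sum(lp) / len(lp)) for seq, lp in beams]


def toy_predict(seq: List[str], index: int) -> List[Tuple[str, float]]:
    """Hypothetical candidates that depend only on the preceding word."""
    table = {
        "written": [("format", 0.7), ("style", 0.2)],
        "is": [("which", 0.6), ("what", 0.3)],
    }
    previous = seq[index - 1] if index > 0 else ""
    return table.get(previous, [("the", 0.5)])


template = ["following", "formula", "written", MASK, "correctly", "is", MASK]
for text, score in fill_masks_with_beam(template, toy_predict, beam_size=2):
    print(" ".join(text), round(score, 3))
```

Averaging the scores rather than summing them keeps sequences with different numbers of masks comparable, which is one plausible reason for the ranking rule described above.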
In another embodiment, determining the target object to be perturbed based on the movable words and the perturbation process to be performed comprises: when the at least one perturbation process includes a deletion process, determining a target object to be subjected to the deletion process based on the movable words. Performing perturbation processing on the original sample based on the target object to obtain an adversarial sample of the original sample comprises: performing word removal on the target object in the original sample, and obtaining the adversarial sample of the original sample according to the perplexity of the text obtained after word removal. The perplexity represents how well the adversarial sample fits natural language; the smaller the perplexity, the better the fit.
Specifically, the deletion perturbation deletes words from the original sample: the movable words are taken as processing targets, and word removal is performed on them.
Specifically, the word removal process first sets the number of deletions based on experience. In this embodiment, the number of deletions K is set from the empirical observation that one word can be deleted for every 5 words, so K varies with the length of the text: K = len(words) / 5. The deletion process may incorporate beam search. Specifically, each processing target is deleted in turn to obtain the text that results from deleting that target, and the perplexity of each resulting text is calculated. A generation space of beam size B is maintained, and the top B texts ranked by perplexity score are kept, which gives B options for the first deletable word. The deletion of the next word is performed on the sequences in this space of size B: for each text obtained after deleting a first deletable word, each remaining processing target is again deleted in turn, the perplexity of each resulting text is calculated, and the top B texts ranked by perplexity score are selected, which gives the second deletable word. After K processing targets have been deleted in sequence, B texts are obtained.
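The deletion perturbation can be sketched as below. Several points are assumptions made for illustration: K is rounded down and forced to be at least 1, a lower perplexity is treated as the better rank (in line with the statement that a smaller perplexity fits natural language better), and `toy_perplexity` is a hypothetical stand-in for a language-model perplexity.

```python
from typing import Callable, List, Sequence, Set, Tuple


def delete_with_beam(
    words: Sequence[str],
    movable: Set[int],
    perplexity: Callable[[List[str]], float],
    beam_size: int = 2,
) -> List[List[str]]:
    """Delete K movable words, keeping the `beam_size` best texts after each round."""
    k = max(1, len(words) // 5)  # assumption: K = len(words) / 5, rounded down, at least 1
    beams: List[Tuple[int, ...]] = [tuple(range(len(words)))]  # indices of words kept
    for _ in range(k):
        candidates = set()
        for kept in beams:
            for index in kept:
                if index in movable:  # only movable words may be deleted
                    candidates.add(tuple(i for i in kept if i != index))
        if not candidates:
            break
        # Assumption: rank by ascending perplexity; ties are broken by index tuple.
        ranked = sorted(
            candidates, key=lambda kept: (perplexity([words[i] for i in kept]), kept)
        )
        beams = ranked[:beam_size]
    return [[words[i] for i in kept] for kept in beams]


def toy_perplexity(text: List[str]) -> float:
    """Hypothetical score: the filler word "um" makes a text less natural."""
    return 10.0 + len(text) + 5.0 * text.count("um")


sample = ["please", "um", "check", "the", "following", "formula"]
for text in delete_with_beam(sample, set(range(len(sample))), toy_perplexity):
    print(" ".join(text))
```

For this six-word sample K is 1, and the best-ranked text is the one with the filler word removed.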
In another embodiment, the perturbation process is a replacement process. The replacement process replaces a movable word with other words; in practical applications, a movable word can be replaced with one word or with several words. Specifically, determining the target object to be perturbed based on the movable words and the perturbation process to be performed includes: when the at least one perturbation process includes a replacement process, determining a replaceable word from the movable words as the target object of the replacement process. Performing perturbation processing on the original sample based on the target object to obtain an adversarial sample of the original sample includes: generating, for the replaceable word, a second mask corresponding to the replaceable word to obtain a second input sequence; and predicting the target words of the second mask according to the second input sequence to obtain the adversarial sample of the original sample.
Specifically, a movable word in the original sample is determined to be the replaceable word, that is, the processing target of the replacement process; the object of replacement is thus a movable word in the original sample, which is replaced with other target words. One replaceable word can be replaced by one target word or by several target words. For a replaceable word, the number of its target words can be determined randomly, and a corresponding number of second masks is generated to obtain the second input sequence. For example, if a replaceable word is randomly determined from the original sample and the number of its target words is 2, two masks are generated at the position of the replaceable word in the original sample to obtain the second input sequence. The target words of the second masks are then predicted in a manner similar to the prediction in the addition process.
For example, for "Is the following formula written correctly?", if the replaceable word is randomly determined to be "formula" and the number of its target words is two, two second masks (MASK) are generated at the position of the replaceable word to obtain a second input sequence, for example "Is the following [MASK] written correctly?". The second input sequence is then input to the prediction model, which predicts the target words of the second masks. During generation, the sequence may be generated with the beam search operation described for the addition process. Specifically, beam search is used to generate the target words, and the target word of the second mask at the next position is predicted according to the target word already predicted for the second mask at the previous position. A beam search (beam_search) is adopted, and its pruning idea is used to select a generation space of size B. When the target words of the second mask at the first position are predicted, the B top-ranked candidate words by score are taken as target words, where B is the beam parameter (beamSize) of the beam search. If the beam parameter is set to 2, the 2 words with the highest probability are taken as target words of the second mask at the first position, that is, several target words are generated for the second mask at the first position. The sequences in the generation space are then looped over to predict the target word of the second mask (MASK) at the next position, and the results are re-sorted and compressed to a generation space of size B, which improves generalization while constraining the cost of generation. As shown in fig. 3, B adversarial texts are generated in this way. Sequences are ranked by the average score of all MASK words. The score may be a probability score or a perplexity score.
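Building the second input sequence for the replacement process can be sketched as follows. The function name `build_second_input_sequence`, the parameter `max_targets`, and the English tokens are hypothetical; the sketch only produces the masked sequence and leaves the prediction of the target words to the prediction model.

```python
import random
from typing import List, Sequence, Set, Tuple

MASK = "[MASK]"


def build_second_input_sequence(
    words: Sequence[str],
    movable: Set[int],
    rng: random.Random,
    max_targets: int = 2,  # assumed upper bound on target words per replaceable word
) -> Tuple[List[str], int]:
    """Pick one movable word at random and replace it with 1..max_targets masks."""
    target = rng.choice(sorted(movable))  # the replaceable word
    n_masks = rng.randint(1, max_targets)  # number of target words, chosen randomly
    sequence = list(words[:target]) + [MASK] * n_masks + list(words[target + 1:])
    return sequence, target


# Hypothetical tokens; only "formula" (index 2) is offered as replaceable here.
tokens = ["the", "following", "formula", "is", "written", "correctly"]
masked, replaced_index = build_second_input_sequence(tokens, {2}, random.Random(0))
print(masked, replaced_index)
```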
In this embodiment, the perturbation is performed in 3 ways, namely addition, deletion, and replacement, which increases the diversity of the adversarial samples and allows them to be generated effectively.
In another embodiment, the adversarial sample generation method further comprises: performing validity detection on the adversarial samples obtained by each perturbation process to obtain adversarial texts that meet the validity requirement; and sorting the adversarial texts that meet the validity requirement by perplexity and selecting the top N of them to obtain the target adversarial samples of the original sample. The perplexity represents how well an adversarial sample fits natural language; the smaller the perplexity, the better the fit.
Specifically, validity detection checks whether a generated adversarial sample is valid. It comprises duplicate detection and detection of non-preset characters. Specifically, performing validity detection on the adversarial samples obtained by each perturbation process to obtain adversarial texts that meet the validity requirement includes: performing duplicate detection on the adversarial samples obtained by each perturbation process and deleting duplicates; acquiring the detection requirement of the service scenario; and, for the adversarial samples remaining after deduplication, taking a sample as an adversarial text that meets the validity requirement if the target words obtained by the perturbation process meet the detection requirement.
Adversarial samples generated by several perturbation modes may contain duplicates, which can be deleted by a deduplication operation. Detection requirements differ between service scenarios, so that the generated adversarial samples meet the requirements of the corresponding scenario. Taking test questions as an example, the characters other than letters and formulas, namely the text, are used as movable words, and the target words obtained by perturbation are usually also required to be text, that is, not letters, digits, or symbols. The detection requirement can therefore be set to check whether a target word obtained by perturbation contains digits or letters; if it does not, the target word is considered to meet the detection requirement of the service scenario, and the sample is taken as an adversarial text that meets the validity requirement.
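The two validity checks can be sketched as below. The sketch assumes that each candidate is passed in together with the target words that the perturbation produced, and that "digits or letters" means ASCII digits and Latin letters; the function name `filter_valid` and the short Chinese strings in the usage example are made up for illustration.

```python
import re
from typing import Iterable, List, Sequence, Tuple

# Assumption: the detection requirement rejects ASCII digits and Latin letters.
_DIGIT_OR_LETTER = re.compile(r"[0-9A-Za-z]")


def filter_valid(candidates: Iterable[Tuple[str, Sequence[str]]]) -> List[str]:
    """Drop duplicate texts, then drop texts whose target words contain digits or letters."""
    seen = set()
    valid: List[str] = []
    for text, target_words in candidates:
        if text in seen:  # duplicate detection
            continue
        seen.add(text)
        if any(_DIGIT_OR_LETTER.search(word) for word in target_words):
            continue  # fails the detection requirement of the service scenario
        valid.append(text)
    return valid


# Hypothetical candidates: (adversarial text, target words produced by the perturbation).
candidates = [
    ("文本一", ["格式"]),
    ("文本一", ["格式"]),  # duplicate of the first candidate
    ("文本二", ["x2"]),    # target word contains a letter and a digit
    ("文本三", ["哪个"]),
]
print(filter_valid(candidates))
```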
For the adversarial texts that meet the validity requirement, perplexity and perplexity loss are used: the perplexity of the text is calculated for each generated sequence, and the smaller the perplexity, the better the generated sequence fits natural language. To ensure quality, a threshold of 0.1 is set for the perplexity loss, which limits the fluctuation of the perplexity of the generated text. The adversarial texts that meet the validity requirement are sorted by perplexity, and the top N are selected to obtain the target adversarial samples of the original sample. In practical applications, the perturbation performed by the prediction model may produce target words with completely opposite semantics, for example by adding a new "not" or deleting an existing "not". To avoid this, target adversarial samples can be further screened manually from the adversarial samples that meet the requirements, so that the target adversarial samples preserve the semantics of, and remain as similar as possible to, the original sample, and the resulting adversarial samples are of high quality.
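The ranking step can be sketched as follows. The perplexity function uses the standard definition, the exponential of the negative mean token log-probability. The text does not define the perplexity loss precisely, so the sketch assumes it is the relative change in perplexity with respect to the original sample; that interpretation, and the names `select_target_samples` and `max_relative_change`, are assumptions and not taken from the original.

```python
import math
from typing import List, Sequence, Tuple


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Standard perplexity: exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("at least one token log-probability is required")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def select_target_samples(
    original_ppl: float,
    candidates: Sequence[Tuple[str, float]],  # (adversarial text, its perplexity)
    top_n: int,
    max_relative_change: float = 0.1,  # assumed meaning of the 0.1 threshold
) -> List[str]:
    """Keep texts whose perplexity stays close to the original, then take the N lowest."""
    kept = [
        (text, ppl)
        for text, ppl in candidates
        if abs(ppl - original_ppl) / original_ppl <= max_relative_change
    ]
    kept.sort(key=lambda item: item[1])  # smaller perplexity ranks first
    return [text for text, _ in kept[:top_n]]


scored = [("a", 21.0), ("b", 19.5), ("c", 30.0), ("d", 20.5)]
print(select_target_samples(20.0, scored, top_n=2))
```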
Take as an example an original sample whose movable words are "Is the following formula written correctly?". The adversarial samples obtained with the replacement, addition, and deletion processes are shown in fig. 4. The replacement perturbation replaces "formula" with "arithmetic formula" or another synonym, and replaces "correct" with "accurate" or "proper". The addition perturbation adds the word "format" between the third word "write" and the fourth word "correct", adds the word "which" after the sixth word "is", and adds a word between the first word "following" and the second word "formula". The deletion perturbation deletes the first word "following". Finally, based on validity detection and perplexity, the adversarial samples obtained are "Is the following formula written correctly?", "Is the following formula written in the correct format?", "Is the formula written correctly?", and "Which of the following formulas is written correctly?".
The adversarial samples generated by this method can be used for adversarial attacks on test-question-related models in the intelligent education industry. Specifically, the original sample and its corresponding adversarial sample are input into a duplicate-question detection model to carry out the adversarial attack. If the detection model outputs a result of "not duplicate", the duplicate-question model has a defect. In a typical service scenario such as the intelligent education industry, the adversarial attack method adopted by the invention can more effectively discover the defects of duplicate-question models for science subjects and helps improve the capability of the model in actual projects. It can be understood that the adversarial sample generation method can also be applied to other industries and other service scenarios: during generation, the immovable words in the original sample are determined according to the service scenario of the target model, and the movable words are obtained from them, so that the adversarial sample retains the original characteristics of the original sample to the greatest extent.
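The attack on a duplicate-question detection model can be sketched as a simple check over pairs. The callable `is_duplicate` stands for the target model, and `naive_detector` is a deliberately weak, hypothetical model used only to show what a detected defect looks like.

```python
from typing import Callable, Iterable, List, Tuple

Pair = Tuple[str, str]  # (original sample, adversarial sample)


def find_missed_duplicates(
    is_duplicate: Callable[[str, str], bool], pairs: Iterable[Pair]
) -> List[Pair]:
    """Return the pairs that the model fails to recognise as duplicates."""
    return [(orig, adv) for orig, adv in pairs if not is_duplicate(orig, adv)]


def naive_detector(first: str, second: str) -> bool:
    """Hypothetical model that only recognises exact string matches."""
    return first == second


pairs = [
    ("Is the following formula written correctly?",
     "Is the following formula written correctly?"),
    ("Is the following formula written correctly?",
     "Is the formula written correctly?"),
]
for orig, adv in find_missed_duplicates(naive_detector, pairs):
    print("defect exposed by:", adv)
```

Each returned pair is a case where the adversarial sample keeps the meaning of the original but the model reports "not duplicate", which is the defect the attack is meant to expose.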
It should be understood that, although the steps in the flowchart of fig. 2 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2 may include several sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 5, an adversarial sample generation apparatus 500 is provided, which may be implemented as part of a computer device using software modules, hardware modules, or a combination of the two. The apparatus 500 specifically includes:
an original sample obtaining module 502, configured to obtain an original sample of a target model;
a preprocessing module 504, configured to determine movable words in the original sample according to a service scenario of the target model;
and a perturbation processing module 506, configured to perform perturbation processing on the original sample based on the movable words to obtain an adversarial sample of the original sample, where the adversarial sample is used to test the target model.
After the adversarial sample generation apparatus obtains the original sample of the target model, it determines the movable words in the original sample according to the service scenario of the target model and perturbs the original sample based on the movable words to obtain the corresponding adversarial sample. For a given target model, the service objective of its service scenario is fixed, so the movable words can be determined according to the service scenario, and the adversarial sample obtained by perturbation also conforms to that service objective. The generated adversarial sample therefore conforms to the service scenario of the target model, which further improves its effectiveness.
In another embodiment, the preprocessing module is configured to determine immovable words in the original sample according to semantic keywords corresponding to the service scenario of the target model, and to determine the movable words in the original sample based on the words other than the immovable words in the original sample.
In another embodiment, the perturbation processing module includes:
a perturbation target determination module, configured to determine a target object to be perturbed based on the movable words and the perturbation process to be performed, where the perturbation process comprises one or more of a replacement process, a deletion process, and an addition process, and the target object comprises one or more of the movable words and the positions to be perturbed;
and a perturbation module, configured to perform perturbation processing on the original sample based on the target object to obtain an adversarial sample of the original sample.
In another embodiment, when the at least one perturbation process includes an addition process, the target object to be perturbed includes the positions in the original sample at which words are to be added and the number of words to be added at each such position.
The perturbation module is configured to generate, at each position, as many first masks as there are words to be added at that position, to obtain a first input sequence, and to predict the target word of each first mask according to the first input sequence to obtain an adversarial sample of the original sample.
In another embodiment, the perturbation target determination module is configured to determine, based on the movable words, a target object to be subjected to a deletion process when the at least one perturbation process includes the deletion process;
the perturbation module is configured to perform word removal on the target object in the original sample and to obtain an adversarial sample of the original sample according to the perplexity of the text obtained after word removal. The perplexity represents how well the adversarial sample fits natural language; the smaller the perplexity, the better the fit.
In another embodiment, the perturbation target determination module is configured to determine, when the at least one perturbation process includes a replacement process, a replaceable word from the movable words as the target object of the replacement process;
the perturbation module is configured to generate, for the replaceable word, a second mask corresponding to the replaceable word to obtain a second input sequence, and to predict the target words of the second mask according to the second input sequence to obtain an adversarial sample of the original sample.
In another embodiment, the apparatus further comprises a screening module, configured to perform validity detection on the adversarial samples obtained by each perturbation process to obtain the adversarial samples that meet the validity requirement, and to sort the adversarial samples that meet the validity requirement by perplexity and select the top N of them to obtain the target adversarial samples of the original sample. The perplexity represents how well an adversarial sample fits natural language; the smaller the perplexity, the better the fit.
In another embodiment, the service scenario of the target model is similarity analysis of science test questions; and the preprocessing module is configured to take the letters and formulas in the original sample as the semantic keywords corresponding to the service scenario of the target model, to determine the immovable words in the original sample according to the semantic keywords, and to determine the movable words of the original sample based on the words other than the immovable words in the original sample.
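For the science test question scenario, the split into immovable and movable words can be sketched as below. The sketch assumes the sample has already been segmented into tokens, and it treats any token containing a Latin letter, a digit, or one of a few common mathematical symbols as a letter or formula; this rule, the name `split_words`, and the sample tokens are assumptions made for illustration.

```python
import re
from typing import List, Sequence, Tuple

# Assumption: a token counts as a letter or formula if it contains a Latin
# letter, a digit, or one of these mathematical symbols.
_KEYWORD = re.compile(r"[A-Za-z0-9=+\-*/^()<>]")


def split_words(words: Sequence[str]) -> Tuple[List[int], List[int]]:
    """Return (immovable indices, movable indices) for a pre-tokenized sample."""
    immovable = [i for i, word in enumerate(words) if _KEYWORD.search(word)]
    fixed = set(immovable)
    movable = [i for i in range(len(words)) if i not in fixed]
    return immovable, movable


# Hypothetical pre-tokenized test question.
question = ["已知", "x+y=3", "求", "x", "的", "值"]
print(split_words(question))
```

Only the tokens at the movable indices would then be offered to the addition, deletion, and replacement processes, so letters and formulas are never perturbed.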
For the specific limitations of the adversarial sample generation apparatus, reference may be made to the limitations of the adversarial sample generation method above, and details are not repeated here. The modules in the adversarial sample generation apparatus described above can be implemented in whole or in part by software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, or can be stored in a memory in the computer device in software form, so that the processor can call them and execute the operations corresponding to the modules.
In one embodiment, a computer device 600 is provided, which may be a server, and whose internal structure may be as shown in FIG. 6. The computer device includes a processor 601, a memory, and a network interface 602 connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium 603 and an internal memory 604. The non-volatile storage medium 603 stores an operating system, computer programs, and a database. The internal memory 604 provides an environment for running the operating system and the computer programs in the non-volatile storage medium. The database of the computer device is used for storing content data to be synchronized. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements an adversarial sample generation method.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described, but any combination should be considered to be within the scope of the present specification as long as the combined features do not contradict each other.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A method of generating a challenge sample, the method comprising:
obtaining an original sample of a target model;
determining a verb in the original sample according to the service scene of the target model;
based on the verb, performing perturbation processing on the original sample to obtain an antagonistic sample of the original sample, wherein the antagonistic sample is used for detecting the target model.
2. The method of claim 1, wherein the determining movable words in the original sample according to the business scenario of the target model comprises:
determining an immobile word in the original sample according to a semantic keyword corresponding to a business scene of the target model, and determining a verb in the original sample based on other words except the immobile word in the original sample.
3. The method according to claim 1 or 2, wherein said perturbation processing on said original sample based on said verb to obtain a confrontation sample of said original sample comprises:
determining a target object to be subjected to perturbation processing based on the verb and the perturbation processing to be performed; wherein the perturbation processing comprises one or more of replacement processing, deletion processing and addition processing, and the target object comprises one or more of the verb and the position to be perturbed;
and carrying out disturbance processing on the original sample based on the target object to obtain a confrontation sample of the original sample.
4. The method according to claim 3, wherein when the at least one perturbation process includes an addition process, the target object to be perturbed includes a word position to be added of a word to be added in the original sample and a number of words to be added corresponding to each word position to be added, and the perturbing the original sample based on the target object to obtain an antagonistic sample of the original sample includes:
generating a first mask with the number of words to be added corresponding to each position of the word to be added to obtain a first input sequence;
and predicting the target word of each first mask according to the first input sequence to obtain a confrontation sample of the original sample.
5. The method of claim 3, wherein determining the target object to be perturbed based on the verbs and the perturbation process to be performed comprises:
when the at least one perturbation process comprises a deletion process, determining a target object to be subjected to the deletion process based on the verb;
the disturbing processing of the original sample based on the target object to obtain a confrontation sample of the original sample includes:
carrying out word removal processing on the target object in the original sample, and obtaining a confrontation sample of the original sample according to the confusion degree of a text obtained after word removal; the confusion degree is used for representing the fit degree of the confrontation sample and the natural language, and the smaller the confusion degree is, the more the confrontation sample fits the natural language.
6. The method of claim 3, wherein determining the target object to be perturbed based on the verbs and the perturbation process to be performed comprises:
when the at least one perturbation process comprises a replacement process, determining a replaceable word from the verb as a target object to be subjected to the replacement process;
the disturbing processing of the original sample based on the target object to obtain a confrontation sample of the original sample includes:
generating a second mask corresponding to the replaceable word aiming at the replaceable word to obtain a second input sequence;
and predicting the target words of the second mask according to the second input sequence to obtain the confrontation samples of the original samples.
7. The method according to any one of claims 3 to 6, further comprising:
carrying out effectiveness detection on each countermeasure sample obtained by disturbance treatment to obtain the countermeasure sample meeting the effectiveness requirement;
and sequencing the confrontation samples meeting the validity requirement according to the confusion degree, selecting the first N confrontation samples meeting the validity requirement, and obtaining the target confrontation samples of the original samples, wherein the confusion degree is used for representing the fitting degree of the confrontation samples and the natural language, and the smaller the confusion degree is, the more the confrontation samples are fitted with the natural language.
8. The method according to claim 2, wherein the business scenario of the target model is similarity analysis of physics test questions;
determining the immovable words in the original sample according to the semantic keywords corresponding to the business scenario of the target model, and determining the movable words in the original sample based on the words other than the immovable words in the original sample, comprises:
taking letters and formulas in the original sample as the semantic keywords corresponding to the business scenario of the target model, determining the immovable words in the original sample according to the semantic keywords, and determining the movable words of the original sample based on the words other than the immovable words in the original sample.
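One way to picture the split into immovable and movable words is the sketch below. The claim does not say how letters and formulas are recognized, so the two regular expressions are assumptions chosen for this example, and `split_words` is a hypothetical name.

```python
import re
from typing import List, Tuple

# Assumed heuristics: a single Latin letter is treated as a symbol, and any
# token containing an operator or a digit is treated as (part of) a formula.
# Note: on English text the single-letter rule would also flag the article "a".
LETTER_RE = re.compile(r"^[A-Za-z]$")
FORMULA_RE = re.compile(r"[=+*/^<>]|\d")

def split_words(tokens: List[str]) -> Tuple[List[int], List[int]]:
    """Return (immovable indices, movable indices) for a tokenized question."""
    immovable, movable = [], []
    for i, tok in enumerate(tokens):
        if LETTER_RE.match(tok) or FORMULA_RE.search(tok):
            immovable.append(i)   # semantic keyword: must not be perturbed
        else:
            movable.append(i)     # candidate position for perturbation
    return immovable, movable

tokens = ["the", "block", "of", "mass", "m", "slides", "with", "F=ma"]
immovable, movable = split_words(tokens)
```

With these tokens, "m" and "F=ma" are held fixed and every other position remains available for removal or replacement.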
9. An adversarial sample generation device, the device comprising:
an original sample acquisition module, configured to acquire an original sample of a target model;
a preprocessing module, configured to determine the movable words in the original sample according to the business scenario of the target model;
and a perturbation processing module, configured to perturb the original sample based on the movable words to obtain an adversarial sample of the original sample, the adversarial sample being used to test the target model.
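The three modules can be pictured as a simple pipeline, as in the sketch below. The class name, field names, and the toy callables are assumptions made for illustration; the claim only specifies the role of each module.

```python
from dataclasses import dataclass
from typing import Callable, List

Tokens = List[str]

@dataclass
class AdversarialSampleGenerator:
    acquire: Callable[[], Tokens]                         # original sample acquisition module
    find_movable: Callable[[Tokens], List[int]]           # preprocessing module
    perturb: Callable[[Tokens, List[int]], List[Tokens]]  # perturbation processing module

    def run(self) -> List[Tokens]:
        """Acquire a sample, locate its movable words, and perturb them."""
        original = self.acquire()
        movable = self.find_movable(original)
        return self.perturb(original, movable)

# Toy wiring, assumed for illustration: "m" is the only immovable word and the
# perturbation deletes one movable word at a time.
generator = AdversarialSampleGenerator(
    acquire=lambda: ["mass", "m", "slides"],
    find_movable=lambda toks: [i for i, w in enumerate(toks) if w != "m"],
    perturb=lambda toks, idx: [toks[:i] + toks[i + 1:] for i in idx],
)
adversarial_samples = generator.run()
```

Keeping each module behind a callable makes it easy to swap in a scenario-specific preprocessing rule or a different perturbation without touching the rest of the pipeline.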
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 8.
CN202110495562.8A 2021-05-07 2021-05-07 Countermeasure sample generation method and device and computer equipment Pending CN115309854A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110495562.8A CN115309854A (en) 2021-05-07 2021-05-07 Countermeasure sample generation method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN115309854A true CN115309854A (en) 2022-11-08

Family

ID=83854029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110495562.8A Pending CN115309854A (en) 2021-05-07 2021-05-07 Countermeasure sample generation method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN115309854A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117874530A (en) * 2024-03-13 2024-04-12 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) Challenge sample detection methods, apparatus, devices, media, and products

Similar Documents

Publication Publication Date Title
Sun et al. Joint type inference on entities and relations via graph convolutional networks
CN111723209B (en) Semi-supervised text classification model training method, text classification method, system, equipment and medium
Jiang et al. Learning to disentangle interleaved conversational threads with a siamese hierarchical network and similarity ranking
CN111046679B (en) Quality information acquisition method and device of translation model and computer equipment
US11625540B2 (en) Encoder, system and method for metaphor detection in natural language processing
CN112100377B (en) Text classification method, apparatus, computer device and storage medium
CN111866004B (en) Security assessment method, apparatus, computer system, and medium
CN112966088B (en) Unknown intention recognition method, device, equipment and storage medium
US11954202B2 (en) Deep learning based detection of malicious shell scripts
CN113672931A (en) Software vulnerability automatic detection method and device based on pre-training
CN111783903A (en) Text processing method, text model processing method and device and computer equipment
JP2021174503A (en) Limit attack method, device, and storage medium against naive bayes sorter
CN116015703A (en) Model training method, attack detection method and related devices
Wang et al. Multi-task multimodal learning for disaster situation assessment
CN115309854A (en) Countermeasure sample generation method and device and computer equipment
CN111177388B (en) Processing method and computer equipment
Zalmout et al. Prototype-representations for training data filtering in weakly-supervised information extraction
CN117112744A (en) Assessment method and device for large language model and electronic equipment
CN113918936A (en) SQL injection attack detection method and device
CN116361788A (en) Binary software vulnerability prediction method based on machine learning
CN111723301B (en) Attention relation identification and labeling method based on hierarchical theme preference semantic matrix
CN114297385A (en) Model training method, text classification method, system, device and medium
CN114398482A (en) Dictionary construction method and device, electronic equipment and storage medium
Im et al. Multilayer CARU model for text summarization
CN113705244B (en) Method, device and storage medium for generating countermeasure text sample

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination