CN112948887B - Social engineering defense method based on adversarial sample generation - Google Patents

Social engineering defense method based on adversarial sample generation

Info

Publication number
CN112948887B
CN112948887B CN202110332011.XA
Authority
CN
China
Prior art keywords
word
social engineering
sample
related information
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110332011.XA
Other languages
Chinese (zh)
Other versions
CN112948887A (en)
Inventor
***
张雅鑫
黄敏
万上锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiaotong University
Original Assignee
Beijing Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiaotong University
Priority to CN202110332011.XA
Publication of CN112948887A
Application granted
Publication of CN112948887B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Bioethics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a social engineering defense method based on adversarial sample generation, comprising the following steps: collecting user-related information from historical leakage events, using it as a training set to train an attention mechanism model, and taking the model that outputs the attack strategy with the maximum probability as the social engineering attack model; determining the score of each word in the user-related information according to its influence on the output of the social engineering attack model; according to these scores, collecting the words with significant influence on the model's output into a substitution table; and replacing words in the sample of the social engineering attack to be defended against according to the substitution table to generate a defensive adversarial sample. Through tiny sample perturbations, the method provides an effective defense against social-engineering-based network attacks, with the excellent characteristics of low investment and high yield.

Description

Social engineering defense method based on adversarial sample generation
Technical Field
The invention relates to the field of information security, and in particular to a method for generating adversarial samples that provides an effective defense against social-engineering-based network attacks.
Background
Academia and industry have proposed various security detection and defense technologies to protect the sensitive data of individuals and enterprises in network systems and prevent data leakage. Existing methods mainly rely on security mechanisms such as data encryption, firewalls, intrusion detection systems, and antivirus software to counter the information and device security threats in a network, so as to defend against malicious attacks, protect equipment and the interior of the network, and prevent sensitive data from being leaked. However, existing security detection and defense methods do not consider the importance and vulnerability of the human factor, so network attacks based on social engineering occur frequently.
No matter how well firewalls, encryption, intrusion detection systems, and antivirus software guarantee security, the human factor remains the weakest link in the whole security chain. To carry out a social engineering attack based on known user information, an attacker first needs to collect that information, such as the user's name, date of birth, mailbox address, mobile phone number, preferences, speech habits, payment habits, tendencies, gender, job position, geographic location, employer information, associated social network accounts, and home address. From this information, the attacker analyzes and exploits the user's current social engineering vulnerabilities and finds a potential weakness through which to mount a targeted attack, thereby manipulating individuals and businesses into revealing valuable and sensitive data. For example, knowing the victim's mailbox address, the attacker can send the victim an email with a malicious document as an attachment. As soon as the victim opens it, the attacker can easily gain a foothold in the network of the victim's company and go on to learn more about that company. According to surveys, social engineering attacks are among the most serious attacks and threaten network systems all over the world.
Depending on the adversary's situation, adversarial samples are divided into two types: (1) the white-box mode, in which adversarial samples are generated with knowledge of the internal structure and parameters of the model used by the adversary; and (2) the black-box mode, in which the internal structure and parameters of the adversary's model are unknown. A defender, however, can know neither the user information held by the adversary nor the strategy the adversary will take. Effectively defending against social engineering attacks can provide tighter protection for the sensitive data of individuals and enterprises and reduce the property losses such attacks cause.
Therefore, a defense method based on adversarial sample generation is needed that can disable social engineering attacks exploiting known user information even when the internal structure and parameters of the adversary's model are unknown, providing a low-cost, strict security guarantee for the sensitive data of individuals and enterprises.
Disclosure of Invention
The invention provides a social engineering defense method based on adversarial sample generation, which aims to overcome the defects of the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme.
A social engineering defense method based on adversarial sample generation, comprising:
s1, collecting user related information in a historical leakage event, taking the user related information as a training set training attention mechanism model, and obtaining a model with the maximum attack strategy probability as a social engineering attack model;
s2, determining the grade of each word in the user related information according to the influence of each word in the user related information on the output result of the social engineering attack model;
s3, according to the score of each word in the user related information, the word with a certain influence degree on the output result of the social engineering attack model is used as a substitution table;
and S4, replacing words in the sample of the social engineering attack to be defended against according to the substitution table to generate a defensive adversarial sample.
Preferably, replacing words in the sample of the social engineering attack to be defended against according to the substitution table to generate a defensive adversarial sample includes:
checking whether the user-related information items in the sample of the social engineering attack to be defended against contain user information in the substitution table;
if user information in the substitution table is contained, taking out the corresponding item, randomly selecting a character from it, replacing that character with a visually similar character, and putting the item back in its original position in the given text, completing the replacement and generating an adversarial sample;
if the sample of the social engineering attack to be defended against does not contain user information in the substitution table, determining the score of each word in the sample according to its influence on the output of the social engineering attack model, selecting the three most influential words in each item of user-related information according to these scores, randomly selecting a character in each such word, replacing it with a visually similar character, and returning the items after replacement to their original positions in the given text, completing the generation of the adversarial sample.
Preferably, the user-related information in the historical leakage events includes the user's name, gender, date of birth, mailbox address, phone number, geographic location, employer information, job position, preferences, payment habits, and device information.
Preferably, the attention mechanism model is a neural network based on a bidirectional long short-term memory network with an attention mechanism, comprising:
1) An input layer: for inputting the contents of the training set to the model;
2) Embedding layer: mapping each word to a low-dimensional vector:
given a sentence of T words S = {x_1, x_2, …, x_T}, each word x_i is converted into its corresponding word vector e_i via e_i = W^wrd·v_i, where W^wrd is a matrix obtained by learning and v_i is a one-hot vector whose dimension is the total number of words;
3) LSTM layer: acquiring high-level features from the embedding layer using a bidirectional long short-term memory network;
4) Attention layer: generating a weight vector and merging the word-level features of each time step into a sentence-level feature vector by multiplying by the weight vector; the sentence is finally represented as h* = tanh(r), where r = H·α^T, α = softmax(w^T·M), M = tanh(H), and H = [h_1, h_2, …, h_T] is the output vector of the LSTM layer;
5) Output layer: finally, the sentence-level feature vector is used for relation classification, the probability of each attack strategy is obtained with the softmax activation function, and the strategy with the maximum probability is output.
Preferably, determining the score of each word in the user-related information according to its influence on the output of the social engineering attack model includes: calculating each word's influence score on the model output with a confidence method and a gradient loss function method respectively, and adding the two scores to determine the word's final score.
Preferably, determining the score of each word according to the influence of each word in the sample of the social engineering attack to be defended against on the output of the social engineering attack model likewise includes: calculating each word's influence score on the model output with the confidence method and the gradient loss function method respectively, and adding the two scores.
Preferably, the confidence method is as shown in equation (1):

C_F(w_k, y_i) = F(s_i, y_i) − F(s_i \ w_k, y_i)    (1)

where C_F(w_k, y_i) is the influence score of the k-th word of the i-th input sample on the output of the social engineering attack model, F denotes the social engineering attack model's confidence in a class, s_i denotes the i-th original sample, y_i is the class label of the i-th sample, w_k is the k-th word of the sample, and s_i \ w_k denotes the sample with the k-th word removed.
Preferably, the gradient loss function method is as shown in equation (2):

C_G(w_k, y_i) = ‖∇_{w_k} J(F, s_i, y_i)‖    (2)

where C_G(w_k, y_i) is the influence score of the k-th word of the i-th input sample on the output of the social engineering attack model, w_k is the k-th word of the sample, F denotes the model, J is the loss function of the model, s_i denotes the i-th original sample, and y_i is the class label of the i-th sample.
Preferably, collecting the words with significant influence on the output of the social engineering attack model into a substitution table includes:
according to the score of each word in the user related information, taking the user related information as a sample, arranging all word scores in each sample from large to small, extracting three words with the highest scores from each sample to form an initial table, removing the duplication of the obtained initial table after all samples are processed, and collecting and storing replacement data to obtain a replacement table.
According to the technical scheme above, the social engineering defense method based on adversarial sample generation operates in black-box mode: existing data leakage events are collected as training data to train a neural network, yielding a social engineering attack model; each item of user-related information is scored to measure its influence on the model's output; the user information with the greatest influence on the model's output is selected and collected into a substitution table; and words in the sample of the social engineering attack to be defended against are replaced according to the substitution table, generating a defensive adversarial sample that causes the social engineering attack model to err, protecting the user from the attack. The method generates effective adversarial samples even when the social engineering attack and its background are unknown, making the adversary's model fail and invalidating social engineering attacks based on known user information. It demonstrates that social engineering attacks in the network security field are preventable, offers an effective adversarial sample generation method, and provides an effective defense technique with the excellent characteristics of low investment and high yield.
Additional aspects and advantages of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
To illustrate the technical solutions of the embodiments more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of the social engineering defense method based on adversarial sample generation according to this embodiment;
FIG. 2 is a schematic diagram of the attention-based bidirectional long short-term memory neural network model;
FIG. 3 is a diagram illustrating the steps of measuring word importance using the confidence method and the gradient loss function method;
FIG. 4 is a schematic flow chart of replacing words in the sample of the social engineering attack to be defended against according to the substitution table to generate a defensive adversarial sample.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, or operations, but do not preclude the presence or addition of one or more other features, integers, steps, operations, and/or groups thereof. It should be understood that the term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For the convenience of understanding of the embodiments of the present invention, the following description will be further explained by taking several specific embodiments as examples with reference to the drawings, and the embodiments of the present invention are not limited thereto.
Examples
To carry out a social engineering attack based on known user information, an attacker first needs to collect that information, such as the user's name, date of birth, gender, mailbox address, mobile phone number, hobbies, speech habits, payment habits, tendencies, job position, geographic location, employer information, associated social network accounts, and home address, then analyzes the user's current social engineering vulnerabilities and mounts a targeted attack. For example, given the victim's email address, the attacker can send the victim an email with a malicious document as an attachment, often named with keywords significantly enticing to the target user, such as the salary structure of the victim's company. As soon as the victim opens the attachment, the attacker can easily gain a foothold in the network of the victim's company and probe for more of its sensitive information. Depending on the adversary's situation, adversarial samples are divided into two types: (1) the white-box mode, in which adversarial samples are generated with the internal structure and parameters of the adversary's model known; and (2) the black-box mode, in which the internal structure and parameters of the adversary's model are unknown. As the defender, however, it is impossible to know the user information held by the adversary or the strategy the adversary will take. This embodiment therefore generates adversarial samples in black-box mode.
Fig. 1 is a schematic flow chart of the social engineering defense method based on adversarial sample generation according to this embodiment, which comprises:
s1, collecting user related information in a historical leakage event, taking the user related information as a training set training attention mechanism model, and obtaining a model with the maximum attack strategy probability as a social engineering attack model.
The close correlation between already-leaked user data and the user's more sensitive data, and even that of the company or organization the user belongs to, is the basis of the attacker's social engineering attack. These leaked data, however, can serve not only as the basis of the attack but also as material for defending against it. In this step, events in which user data was leaked are collected, and the items of user-related information are extracted and integrated as training data for a generation model that represents the social engineering attack model. The leaked data include the user's name, gender, date of birth, email address, phone number, geographic location, employer information, job title, preferences, payment habits, and device information. Since the collected raw user information items come in the form of structured lists, while the model requires word sequences such as sentences as input, the method converts the collected raw data tables into sentences describing the user-related information in order.
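The conversion from a structured record to a sentence-like word sequence can be sketched as follows; the field names and their ordering are illustrative assumptions, not the patent's exact schema:

```python
# Hypothetical field order for serializing a leaked-data record;
# the actual schema used by the method is not specified here.
FIELD_ORDER = ["name", "gender", "date_of_birth", "email", "phone",
               "location", "employer", "job_title", "preferences",
               "payment_habits", "device"]

def record_to_sentence(record):
    """Flatten a structured record into a space-separated 'sentence'."""
    return " ".join(str(record[f]) for f in FIELD_ORDER if f in record)
```

For example, `record_to_sentence({"name": "Alice", "location": "Beijing", "device": "Y7000P"})` yields `"Alice Beijing Y7000P"`, a word sequence the model can consume.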
The attention mechanism model is a neural network based on a bidirectional long short-term memory (BiLSTM) network with an attention mechanism, which automatically focuses on the words that are decisive for classification and captures the most important semantic information in a sentence without using additional knowledge. Fig. 2 is a schematic structural diagram of this model; as shown in Fig. 2, it comprises the following 5 parts:
1) An input layer: for inputting the content in the training set to the model;
2) Embedding layer: mapping each word to a low-dimensional vector:
given a sentence of T words S = {x_1, x_2, …, x_T}, each word x_i is converted into its corresponding word vector e_i via e_i = W^wrd·v_i, where W^wrd is a matrix obtained by learning and v_i is a one-hot vector whose dimension is the total number of words;
3) LSTM layer: acquiring high-level features from the embedding layer using a bidirectional long short-term memory network;
4) Attention layer: generating a weight vector and merging the word-level features of each time step into a sentence-level feature vector by multiplying by the weight vector; the sentence is finally represented as h* = tanh(r), where r = H·α^T, α = softmax(w^T·M), M = tanh(H), and H = [h_1, h_2, …, h_T] is the output vector of the LSTM layer;
5) Output layer: finally, the sentence-level feature vector is used for relation classification, the probability of each attack strategy is obtained with the softmax activation function, and the strategy with the maximum probability is output.
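A minimal numeric sketch of the attention layer's computation above, assuming the BiLSTM outputs are stored as columns of H; the parameter vector w stands in for the learned attention weights:

```python
import numpy as np

def attention_pool(H, w):
    """Merge word-level features into a sentence-level vector.

    H: (d, T) array, the BiLSTM output vectors h_1..h_T as columns.
    w: (d,) array, the learned attention parameter vector.
    Returns (h_star, alpha), where h_star is the sentence representation.
    """
    M = np.tanh(H)                   # M = tanh(H)
    scores = w @ M                   # w^T M, one score per time step
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()      # alpha = softmax(w^T M)
    r = H @ alpha                    # r = H alpha^T
    return np.tanh(r), alpha         # h* = tanh(r)
```

The softmax subtracts the maximum score before exponentiating, a standard numerical-stability trick that leaves the weights unchanged.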
S2, determining the score of each word in the user-related information according to its influence on the output of the social engineering attack model.
The confidence method and the gradient loss function method are used to calculate each word's influence score on the output of the social engineering attack model, and the two scores are added to determine how strongly each item of user-related information influences the model's output. A high-scoring word has a large influence on the model's output: slightly changing it can change the classification of a passage of text to a great extent, causing the social engineering attack model to err. Through these two evaluation methods, the method screens out the user information with the greatest influence on the model's output and alters it in subsequent steps, invalidating the adversary's model more efficiently.
The essence of the confidence method is that, for a passage of text, if deleting one word changes the confidence of the passage's classification to a greater extent, that word determines the classification to a greater extent and has a larger influence on the model's output. In this embodiment, the influence of each word on the output of the social engineering attack model is measured by the confidence method to compute a score, as shown in equation (1):
C_F(w_k, y_i) = F(s_i, y_i) − F(s_i \ w_k, y_i)    (1)

where C_F(w_k, y_i) is the influence score of the k-th word of the i-th input sample on the output of the social engineering attack model, F denotes the social engineering attack model's confidence in a class, s_i denotes the i-th original sample, y_i is the class label of the i-th sample, w_k is the k-th word of the sample, and s_i \ w_k denotes the sample with the k-th word removed.
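The confidence method of equation (1) can be sketched as follows; `model_confidence` is a hypothetical callable standing in for F, returning the model's confidence for a word list and class label:

```python
def confidence_score(model_confidence, words, k, label):
    """C_F(w_k, y_i): drop in class confidence when the k-th word is removed.

    model_confidence: callable (words, label) -> float, a stand-in for F.
    words: list of words forming sample s_i.
    k: index of the word being scored.
    label: class label y_i.
    """
    full = model_confidence(words, label)                 # F(s_i, y_i)
    ablated = model_confidence(words[:k] + words[k + 1:], label)
    return full - ablated                                 # equation (1)
```

A word whose removal sharply lowers the model's confidence receives a high score, matching the intuition stated above.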
This embodiment uses the idea of FGSM (Fast Gradient Sign Method) to find, through back propagation, the loss gradient of the social engineering attack model with respect to each word of the input sample sentence. For a passage of text, if the loss gradient of a certain word obtained through back propagation is larger, that word determines the classification of the passage to a greater extent and has a larger influence on the output of the social engineering attack model. The gradient loss function method is therefore used to measure each word's influence on the model output and compute a score, as shown in equation (2):
C_G(w_k, y_i) = ‖∇_{w_k} J(F, s_i, y_i)‖    (2)

where C_G(w_k, y_i) is the influence score of the k-th word of the i-th input sample on the output of the social engineering attack model, w_k is the k-th word of the sample, F denotes the model, J is the loss function of the model, s_i denotes the i-th original sample, and y_i is the class label of the i-th sample.
The influence scores of each word on the social engineering attack model obtained by the two evaluation methods, confidence and gradient loss function, are C_F and C_G respectively. The two evaluation methods are combined by the formula C = C_F + C_G to obtain each word's final influence score.
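The combination C = C_F + C_G and the subsequent ranking can be sketched as follows (the word lists and scores are illustrative):

```python
def rank_words(words, cf_scores, cg_scores):
    """Rank words by combined influence C = C_F + C_G, highest first."""
    total = [cf + cg for cf, cg in zip(cf_scores, cg_scores)]
    order = sorted(range(len(words)), key=lambda i: total[i], reverse=True)
    return [words[i] for i in order], total
```

Python's sort is stable, so words with equal combined scores keep their original relative order.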
Schematically, Fig. 3 illustrates the steps of measuring word importance with the confidence and gradient loss function methods. Referring to Fig. 3, the user information is converted into the sentence "zhangyan, sudi, beijing, use, levono, Y7000P" as input; the confidence and gradient value of each word are computed through the social engineering threat model, the scores of the two methods are added, and the words are ranked from high score to low. The ordered text is "Levono, Y7000P, zyan, beijing, sudi, use".
S3, according to the score of each word in the user-related information, collecting the words with significant influence on the output of the social engineering attack model into a substitution table.
According to the score of each word in the user related information, each piece of user related information is taken as a sample; the word scores within each sample are arranged from large to small, and the three highest-scoring words of each sample are extracted into an initial table. After all samples are processed, the initial table is de-duplicated and the replacement data are collected and stored, yielding the substitution table.
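The table-building step above can be sketched as follows; the sample word lists and scores are invented, and the scoring function stands in for the combined confidence-plus-gradient score:

```python
def build_substitution_table(samples, score_fn, top_k=3):
    """Top-k highest-scoring words of each sample, de-duplicated in order."""
    table = []
    for sample in samples:
        scores = score_fn(sample)
        table.extend(sorted(sample, key=scores.get, reverse=True)[:top_k])
    return list(dict.fromkeys(table))  # de-dupe, keep first-seen order

# Invented per-word scores standing in for C = C_F + C_G.
toy_scores = {"levono": 0.85, "y7000p": 0.73, "zhangyan": 0.25,
              "beijing": 0.20, "sudi": 0.09, "use": 0.05, "student": 0.50}
samples = [["zhangyan", "sudi", "beijing", "use", "levono", "y7000p"],
           ["levono", "beijing", "student"]]
table = build_substitution_table(samples, lambda s: toy_scores)
# table == ['levono', 'y7000p', 'zhangyan', 'student', 'beijing']
```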
And S4, replacing words in the sample of the social engineering attack to be defended (schematically, an e-mail) according to the substitution table to generate a defendable confrontation sample.
The method specifically comprises the following steps:
comparing the user related information items in the sample of the social engineering attack to be defended to determine whether they contain the user information in the substitution table;
if the user information in the substitution table is contained, taking out the corresponding item, randomly selecting one character of that item, replacing it with a character containing similar visual information, and putting the item back in its original position in the given text to complete the replacement and generate a confrontation sample;
if the sample of the social engineering attack to be defended does not contain the user information in the substitution table, determining the score of each word in the user related information according to the influence of each word in the sample on the output result of the social engineering attack model, selecting the three words with the greatest influence from each piece of user related information according to the scores, randomly selecting one of these words, replacing it with a character containing similar visual information, and returning the item after the replacement to its original position in the given text, completing the generation of the confrontation sample and achieving the slight perturbation.
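A minimal sketch of this decision flow, with invented helper names: `score_fn` stands in for the combined confidence-plus-gradient scoring and `perturb_fn` for the visual-similarity character replacement; for determinism the sketch takes the single top-scoring word rather than sampling among the top three:

```python
def defend(items, table, score_fn, perturb_fn, top_k=3):
    """Perturb one influential user-related information item of a sample.

    items: the user-related information items of the sample to defend.
    table: the substitution table built from historical leakage samples.
    """
    hits = [it for it in items if it in table]
    if hits:
        # a substitution-table word is present: perturb it in place
        target = hits[0]
    else:
        # otherwise score this sample's items and pick from the top-k
        scores = score_fn(items)
        target = sorted(items, key=scores.get, reverse=True)[:top_k][0]
    return [perturb_fn(it) if it == target else it for it in items]

# Example with stand-in helpers (names and values invented):
sample = ["levono", "beijing", "use"]
defended = defend(sample, table={"levono"},
                  score_fn=lambda s: {},  # unused when a table word hits
                  perturb_fn=lambda w: w.replace("l", "1", 1))
# defended == ["1evono", "beijing", "use"]
```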
In this embodiment, a word is replaced with characters containing similar visual information. For human beings, the visual information contained in a word is decisive for assessing its meaning, and characters containing similar visual information are easily recognized as the same by humans. For example: the Latin letter "A" (Unicode: 0041) contains visual information similar to the Greek letter "Λ" (Unicode: 039B), and the Latin letter "a" (Unicode: 0061) contains the same visual information as the Cyrillic letter "а" (Unicode: 0430); they are judged by humans as the uppercase Latin letter "A" and the lowercase Latin letter "a", respectively. The Chinese characters "壸" (Unicode: 58F8) and "壶" (Unicode: 58F6) also contain similar visual information. However, neural network models process text in a way significantly different from humans: current models have no notion of visual similarity between characters. When a model processes text, characters are treated as discrete units that make up a word, and changing a single character easily causes the model to misidentify one word as another, producing an error. The method exploits this difference between human and model text processing as a blind spot of the model: the original sample is slightly perturbed by means of identical or visually similar characters, so that the adversary model used by an attacker makes errors while the user can still recover the sample information normally, causing a social engineering attack based on the known user information to fail.
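The character-level perturbation can be sketched with a small homoglyph table built from the pairs named above; the table is a tiny illustrative subset, not a full Unicode confusables list:

```python
import random

# Visually similar character pairs taken from the examples in the text.
HOMOGLYPHS = {
    "A": "\u039b",       # Latin A -> Greek capital Lambda (similar glyph)
    "a": "\u0430",       # Latin a -> Cyrillic a (identical glyph)
    "l": "1",            # Latin l -> digit 1 (as in "levono" -> "1evono")
    "\u58f6": "\u58f8",  # Chinese kettle characters with similar glyphs
}

def perturb(item, rng=random.Random(0)):
    """Swap one randomly chosen replaceable character for its homoglyph."""
    slots = [i for i, ch in enumerate(item) if ch in HOMOGLYPHS]
    if not slots:
        return item  # nothing replaceable: leave the item unchanged
    i = rng.choice(slots)
    return item[:i] + HOMOGLYPHS[item[i]] + item[i + 1:]

# "levono" has a single replaceable character, so the result is
# deterministic: perturb("levono") == "1evono"
```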
As in step S2, the confidence coefficient method and the gradient loss function method are used in this step to calculate each word's influence score on the output result of the social engineering attack model, and the scores of the two methods are added to determine the score of each word in the user related information.
Fig. 4 is a schematic diagram of the specific flow of generating a defendable confrontation sample by replacing words in the sample of the social engineering attack to be defended according to the substitution table in the embodiment of the present invention. Referring to fig. 4, on the basis of the obtained substitution table, for a given sample of a social engineering attack to be defended, first compare whether the user related information items in the given text contain "levono" from the substitution table. If an item contains "levono", the corresponding item is taken out and the "l" (Latin letter l) in "levono" is replaced with "1" (digit 1), so that the item becomes "1evono"; the replaced content is put back in its original position in the given text, generating a confrontation sample. If the given text does not contain any user information from the substitution table (as shown, in "small de, present place, xu, one, student"), the score of each word of the user related information is determined according to its influence on the output result of the social engineering attack model, the three words with the greatest influence are selected according to the scores, namely "student, small de, xu", and the character "研" (Unicode: 7814) in "student" is replaced with "硏" (Unicode: 784F; also read "yán", a variant form of "研" whose right-hand side is two "干" components); the item after the replacement is put back in its original position in the given text, completing the generation of the confrontation sample.
In conclusion, confrontation samples are generated on the basis of the social engineering attack model so that both the social engineering attack model and manual judgment make errors, invalidating social engineering attacks based on known user information, effectively defending against social engineering attacks, and providing a strong guarantee for the sensitive data of individuals and enterprises.
Those skilled in the art should understand that the above application types of the input box are only examples; other existing or future application types of the input box, if applicable to the embodiments of the present invention, should also fall within the scope of the present invention and are incorporated herein by reference.
From the above description of the embodiments, it is clear to those skilled in the art that the present invention can be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, or optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method of the embodiments or parts of the embodiments.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A social engineering defense method based on confrontation sample generation is characterized by comprising the following steps:
s1, collecting user related information in historical leakage events, using the user related information as a training set to train an attention mechanism model, and obtaining the model that outputs the attack strategy with the maximum probability as the social engineering attack model;
s2, determining the score of each word in the user related information according to the influence of each word in the user related information on the output result of the social engineering attack model;
s3, according to the score of each word in the user related information, using the words with a certain degree of influence on the output result of the social engineering attack model as a substitution table;
s4, replacing words in the sample of the social engineering attack to be defended according to the replacement table to generate a defendable confrontation sample; the method specifically comprises the following steps:
comparing the user related information items in the sample of the social engineering attack to be defended to determine whether they contain the user information in the substitution table;
if the user information in the substitution table is contained, taking out the corresponding item, randomly selecting a character from the corresponding item, replacing the character with the character containing similar visual information, and putting the item after replacing the character back to the original position in the given text to finish replacing and generate a confrontation sample;
if the sample of the social engineering attack to be defended does not contain the user information in the substitution table, determining the score of each word in the user related information according to the influence of each word in the sample of the social engineering attack to be defended on the output result of the social engineering attack model, selecting the three words with the largest influence from each piece of user related information according to the scores, randomly selecting one of these words, replacing it with a word containing similar visual information, returning the item after the word substitution to its original position in the given text, and completing the generation of the confrontation sample;
the determining the grade of each word in the user related information according to the influence of each word in the user related information on the output result of the social engineering attack model comprises the following steps: respectively calculating the influence degree score of each word on the output result of the social engineering attack model by adopting a confidence coefficient method and a gradient loss function method, and adding the scores of the two methods to determine the score of each word in the user related information;
the method for determining the score of each word in the user related information according to the influence of each word in the sample of the social engineering attack to be defended on the output result of the social engineering attack model comprises the following steps: and respectively calculating the influence degree score of each word on the output result of the social engineering attack model by adopting a confidence coefficient method and a gradient loss function method, and adding the scores of the two methods to determine the score of each word in the user related information.
2. The method of claim 1, wherein the user related information in the historical leakage event comprises the user's name, gender, date of birth, email address, phone number, geographic location, work-unit information, job title, preferences, payment habits, and device information.
3. The method of claim 1, wherein the attention mechanism model is an attention-based neural network model built on a bidirectional long short-term memory network, comprising:
1) An input layer: for inputting the contents of the training set to the model;
2) Embedding layer: mapping each word to a low-dimensional vector:
given a sentence of T words $S = \{x_1, x_2, \ldots, x_T\}$, each word $x_i$ is converted into its corresponding word vector $e_i$ via $e_i = W^{wrd} v_i$, where $W^{wrd}$ is a matrix obtained by learning and $v_i$ is a vector whose dimension is the total number of words;
3) LSTM layer: obtaining high-level features from the embedding layer using a bidirectional long short-term memory network;
4) Attention layer: generating a weight vector and merging the word-level features of each time step into a sentence-level feature vector by multiplying by the weight vector; the sentence representation is finally $h^{*} = \tanh(r)$, where $r = H\alpha^{T}$, $\alpha = \mathrm{softmax}(w^{T}M)$, $M = \tanh(H)$, and $H = [h_1, h_2, \ldots, h_T]$ is the output vector of the LSTM layer;
5) Output layer: using the sentence-level feature vector for relation classification, obtaining the probability of each attack strategy with the softmax activation function, and outputting the attack strategy with the maximum probability.
4. The method of claim 1, wherein the confidence coefficient method is given by the following formula (1):
$C_F(w_k, y_i) = F_{y_i}(s_i) - F_{y_i}(s_i \setminus w_k)$   (1)

wherein $C_F(w_k, y_i)$ denotes the influence score of the kth word of the ith input sample on the output result of the social engineering attack model, $F$ denotes the social engineering attack model, $s_i$ denotes the ith original sample, $y_i$ is the class label of the ith sample, $w_k$ is the kth word of the sample, and $s_i \setminus w_k$ denotes the sample with the kth word removed.
5. The method of claim 1, wherein the gradient loss function method is represented by the following formula (2):
$C_G(w_k, y_i) = \left\| \nabla_{w_k} J\big(F(s_i), y_i\big) \right\|$   (2)

wherein $C_G(w_k, y_i)$ denotes the influence score of the kth word of the ith input sample on the output result of the social engineering attack model, $w_k$ is the kth word of the sample, $F$ denotes the model, $J$ is the loss function of the model, $s_i$ denotes the ith original sample, and $y_i$ is the category label of the ith sample.
6. The method according to claim 1, wherein using the words having a certain influence on the output result of the social engineering attack model as a substitution table comprises:
according to the score of each word in the user related information, taking each piece of user related information as a sample; arranging the word scores within each sample from large to small and extracting the three highest-scoring words of each sample into an initial table; and after all samples are processed, de-duplicating the initial table and collecting and storing the replacement data to obtain the substitution table.
CN202110332011.XA 2021-03-29 2021-03-29 Social engineering defense method based on confrontation sample generation Active CN112948887B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110332011.XA CN112948887B (en) 2021-03-29 2021-03-29 Social engineering defense method based on confrontation sample generation


Publications (2)

Publication Number Publication Date
CN112948887A CN112948887A (en) 2021-06-11
CN112948887B true CN112948887B (en) 2023-03-28

Family

ID=76227137

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110332011.XA Active CN112948887B (en) 2021-03-29 2021-03-29 Social engineering defense method based on confrontation sample generation

Country Status (1)

Country Link
CN (1) CN112948887B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723506B (en) * 2021-08-30 2022-08-05 南京星环智能科技有限公司 Method and device for generating countermeasure sample and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117482A (en) * 2018-09-17 2019-01-01 武汉大学 A kind of confrontation sample generating method towards the detection of Chinese text emotion tendency
CN111046673A (en) * 2019-12-17 2020-04-21 湖南大学 Countermeasure generation network for defending text malicious samples and training method thereof
CN112364641A (en) * 2020-11-12 2021-02-12 北京中科闻歌科技股份有限公司 Chinese countermeasure sample generation method and device for text audit

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11275841B2 (en) * 2018-09-12 2022-03-15 Adversa Ai Ltd Combination of protection measures for artificial intelligence applications against artificial intelligence attacks


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Research on Adversarial Attacks and Defenses for Natural Language Processing ***"; Li Jinfeng; "Full-text Database of Outstanding Master's Theses"; 20200815 (No. 08); full text *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant