CN115277216A - Vulnerability exploitation attack encryption flow classification method based on multi-head self-attention mechanism - Google Patents
- Publication number
- CN115277216A (application number CN202210905960.7A)
- Authority
- CN
- China
- Prior art keywords
- feature
- features
- flow
- traffic
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1416—Event detection, e.g. attack signature detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/57—Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
- G06F21/577—Assessing vulnerabilities and evaluating computer system security
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2483—Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1433—Vulnerability analysis
Abstract
The invention provides a method for classifying vulnerability exploitation attack encrypted traffic based on a multi-head self-attention mechanism. The method comprises the following steps. Step 1: parse the encrypted traffic data to be classified into json-format data and filter the parsed data. Step 2: analyze the metadata in the exploit-attack encrypted traffic and the key features of the TLS protocol, extract the core features required for traffic classification, and finally convert the core features into a CSV-format file. Step 3: divide the processed encrypted traffic data proportionally into a training set and a test set, take the malicious-attack traffic type as the label, train a multi-head self-attention model with the training set, evaluate the model with the test set, and then optimize it to obtain the final vulnerability exploitation attack encrypted-traffic classification model. Step 4: preprocess the encrypted traffic to be detected according to step 1, extract features according to step 2, and input the extracted features into the trained classification model to obtain the final traffic classification result.
Description
Technical Field
The invention belongs to the field of network-traffic vulnerability-attack classification, and relates to a method for classifying vulnerability exploitation attack encrypted traffic based on a multi-head self-attention mechanism.
Background
In recent years, as users' awareness of network privacy has grown, using the TLS protocol to protect communications has become increasingly popular. Hackers who carry out malicious attacks over the network have also noticed the excellent properties of this protocol and exploit it to conceal malicious attack behavior. Because the protocol encrypts most of the information in the data packets, with the session keys negotiated through asymmetric cryptography, the data content of such traffic essentially cannot be decrypted by force, and traditional traffic-classification methods based on payload matching cannot act on this kind of malicious encrypted traffic, which poses a serious challenge to network protection and management. It is therefore imperative to study how to classify and detect encrypted malicious-attack traffic without decryption.
To classify encrypted malicious-attack traffic without decryption, the invention first studies the TLS protocol in detail and performs feature selection around the relevant characteristics of TLS traffic, such as metadata-related information and TLS-protocol-related information, so that benign and malicious TLS traffic can be distinguished and different kinds of malicious-attack encrypted traffic can be classified and identified. After feature extraction, the mainstream approach is simply to reduce the dimensionality of the traffic features, which ignores the correlations among them and yields low classification accuracy. To make efficient use of the TLS traffic features, the invention proposes a multi-head self-attention mechanism that learns the key features in TLS traffic and looks for the potential associations between different features. Features of different types are mapped into the same low-dimensional space, feature crosses are modeled in that space, feature combinations with high correlation are identified by the multi-head self-attention mechanism, and key high-order features are constructed for model classification. On this basis, a multi-head self-attention vulnerability exploitation attack encrypted-traffic classification model (TLS-MHSA) is constructed; it can classify encrypted traffic without decryption, lets the neural network emphasize the TLS-protocol characteristics of exploit-attack encrypted traffic, and classifies malicious encrypted traffic with high accuracy.
Disclosure of Invention
In view of the fact that the accuracy of classifying vulnerability exploitation attack encrypted traffic still needs to be improved over the prior art, because current deep-learning models cannot make full use of the relevant characteristics of malicious encrypted traffic, the invention solves this problem with a method for classifying exploit-attack encrypted traffic based on a multi-head self-attention mechanism.
The invention provides a method for classifying vulnerability exploitation attack encrypted traffic based on a multi-head self-attention mechanism, which comprises the following steps:
step 1, parsing the encrypted traffic data to be classified into json-format data, and filtering the parsed data;
step 2, analyzing metadata in the vulnerability exploitation attack encryption flow and key features in a TLS protocol, extracting core features required by flow classification, and finally converting the core features into a CSV format file;
step 3, dividing the processed encrypted traffic data into a training set and a test set in proportion, taking the malicious-attack traffic type as the label, training the multi-head self-attention network model with the training set, evaluating the model with the test set, and optimizing the model to obtain the final vulnerability exploitation attack encrypted-traffic classification model;
step 4, preprocessing the encrypted traffic to be detected according to step 1, then extracting features according to step 2, and inputting the extracted features into the trained classification model to obtain the final traffic classification result.
Further, the step 1 specifically comprises the following steps:
step 1.1, analyzing the encrypted flow data set into json format data;
step 1.2, cleaning and filtering the analyzed flow data, deleting redundant flow data, and only keeping TLS encrypted flow data;
step 1.3, shuffling the order of the original traffic data to improve the generalization ability of the model after learning.
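Steps 1.1 to 1.3 can be sketched as follows. This is a minimal illustration, assuming each flow record has already been exported as one JSON object per line; the field name `protocol` and the helper `preprocess` are illustrative, not part of the patent:

```python
import json
import random

def preprocess(json_lines, seed=42):
    """Step 1: parse flow records, keep only TLS flows, shuffle the order."""
    flows = [json.loads(line) for line in json_lines]             # step 1.1: parse into json objects
    tls_flows = [f for f in flows if f.get("protocol") == "TLS"]  # step 1.2: drop non-TLS/redundant flows
    random.Random(seed).shuffle(tls_flows)                        # step 1.3: shuffle for generalization
    return tls_flows

records = [
    '{"protocol": "TLS", "src_port": 443,  "bytes_out": 5120}',
    '{"protocol": "DNS", "src_port": 53,   "bytes_out": 120}',
    '{"protocol": "TLS", "src_port": 8443, "bytes_out": 900}',
]
flows = preprocess(records)
print(len(flows))  # only the two TLS flows remain
```

A fixed seed is used so the shuffle is reproducible across runs.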
Further, the specific implementation of the step 2 includes the following steps:
step 2.1, TLS-protocol encrypted-traffic feature selection: obtaining the differences in the parameters and extension information of the TLS (Transport Layer Security) protocol between exploit-attack encrypted traffic and normal encrypted traffic, and extracting the differing parameters and extension information of the TLS protocol as the key features of TLS encrypted traffic;
step 2.2, extracting metadata features: extracting the metadata features possessed by the traffic data, such as IP address, port and ingress/egress bytes, as auxiliary features;
step 2.3, marking a label on each piece of traffic data after feature extraction, i.e. marking the actual type of the traffic.
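The labelled feature rows of step 2 can be serialized to the CSV file mentioned above roughly as follows. The field names here (`cipher_suites`, `tls_ext_count`, etc.) are hypothetical placeholders for the TLS and metadata features, not the patent's actual feature list:

```python
import csv
import io

# Hypothetical per-flow feature rows: TLS key features plus metadata auxiliary
# features (steps 2.1/2.2), with the actual traffic type as the label (step 2.3).
FIELDS = ["cipher_suites", "tls_ext_count", "src_port", "bytes_in", "bytes_out", "label"]

def to_csv(feature_rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for row in feature_rows:
        writer.writerow(row)
    return buf.getvalue()

rows = [
    {"cipher_suites": 12, "tls_ext_count": 7, "src_port": 443,
     "bytes_in": 20480, "bytes_out": 5120, "label": "benign"},
    {"cipher_suites": 3, "tls_ext_count": 2, "src_port": 4433,
     "bytes_in": 512, "bytes_out": 90112, "label": "exploit_attack"},
]
csv_text = to_csv(rows)
print(csv_text.splitlines()[0])  # header line
```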
Further, the specific implementation of step 3 includes the following steps:
step 3.1, dividing the processed traffic data into a training set and a test set in the ratio of 8:2;
step 3.2, constructing a neural-network exploit-attack encrypted-traffic classification model, TLS-MHSA, based on a multi-head self-attention (MHSA) mechanism, and feeding the preprocessed traffic data into TLS-MHSA; the model comprises an input layer, an embedding layer, multi-head self-attention layers and an output layer; first, the preprocessed network-traffic feature vector x is fed into the input layer, all features are mapped into the same low-dimensional space by the embedding layer, and low-dimensional vectors are output; the multi-head self-attention mechanism then maps these vectors into several subspaces, where they are combined into different high-order features; by stacking multiple multi-head self-attention layers, multiple high-order feature combinations can be obtained, and the effectiveness of each combination is judged by the attention mechanism; finally, the combined feature vector from the last layer is fed into the fully connected layer, and the classification result is output through a softmax function;
step 3.3, training the classification model with the training set divided in step 3.1, and evaluating and optimizing the parameters with the corresponding test set to obtain the final vulnerability exploitation attack encrypted-traffic classification model.
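The proportional split of step 3.1 can be sketched as below. An 8:2 ratio is assumed here for illustration; the helper name and seed are not from the patent:

```python
import random

def split_dataset(samples, train_ratio=0.8, seed=7):
    """Step 3.1: proportionally split labelled flows into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Toy labelled samples: (flow id, malicious-attack traffic type used as label)
samples = [(i, "exploit" if i % 3 == 0 else "benign") for i in range(100)]
train, test = split_dataset(samples)
print(len(train), len(test))  # 80 20
```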
Further, the method also comprises extracting the cipher suite, the TLS extensions and their related information as features; the related information of the TLS certificate, an important link in the handshake process when a TLS connection is established, is likewise used as a feature, and non-critical parameter information in the TLS protocol, such as the client key length, is retained as an auxiliary feature.
Further, the step of obtaining the high-order feature combinations is as follows, taking feature a as an example:
(1) Select any other feature b and calculate the relevance score of features a and b under attention head h:
α_{a,b}^{(h)} = exp(s^{(h)}(e_a, e_b)) / Σ_{l=1}^{A} exp(s^{(h)}(e_a, e_l))
where s^{(h)} is the attention scoring function, for which the common dot-product model is chosen:
s^{(h)}(e_a, e_b) = ⟨W_Query^{(h)} e_a, W_Key^{(h)} e_b⟩
where W_Query^{(h)}, W_Key^{(h)} and W_Value^{(h)} are all transformation matrices of dimension d′ × d; that is, they map the original embedding space R^d into a new space R^{d′}, converting the vector representations e_a and e_b of the traffic features x_a and x_b from the d-dimensional space into the d′-dimensional space;
(2) Through the coefficients α_{a,b}^{(h)}, all associated features are used to update the representation of feature a in subspace h, i.e. each feature is represented as the weighted sum of all other relevant features. The newly learned feature can be expressed as:
ẽ_a^{(h)} = Σ_{b=1}^{A} α_{a,b}^{(h)} (W_Value^{(h)} e_b)
where W_Value^{(h)} is also a d′ × d transformation matrix, and ẽ_a^{(h)}, in the d′-dimensional space, is the combined feature obtained by crossing feature x_a with its related features under subspace h, i.e. a new combined feature learned by the method;
(3) The multi-head self-attention mechanism extends self-attention from one head to several heads, so that different feature-cross information can be learned from the subspaces represented by the different heads. The feature-cross results learned by the different heads are concatenated according to the following formula, where the symbol ⊕ denotes the concatenation operation, H is the number of heads used by the multi-head self-attention mechanism, and ẽ_a^{(i)} (i = 1, 2, …, H) is the combined feature of x_a crossed with its related features under subspace i; the feature-cross result ẽ_a is:
ẽ_a = ẽ_a^{(1)} ⊕ ẽ_a^{(2)} ⊕ … ⊕ ẽ_a^{(H)}
To let the model learn high-order traffic features while retaining the low-order original features, the method also adds a classical residual network to the multi-head self-attention layer, where W_Res is a transformation matrix that aligns the dimension of e_a with that of ẽ_a, and ReLU(t) = max(0, t) is a nonlinear activation function:
e_a^{Res} = ReLU(ẽ_a + W_Res e_a)
The output layer is a classifier composed of a fully connected layer and softmax. The fully connected layer maps the input feature vector through a linear transformation into the sample label space R^C, obtaining a vector z ∈ R^C, where C is the total number of traffic classes to be distinguished; the softmax classification function then yields the final classification result.
Compared with the prior art, the invention has the beneficial effects that:
1. The method makes full use of the key feature information carried by the TLS protocol in encrypted traffic and provides TLS-MHSA, a multi-head self-attention exploit-attack encrypted-traffic classification model; the model uses the self-attention mechanism to learn the key features in the preprocessed traffic data and the potential correlations between different features, and further uses the multi-head mechanism to learn important high-order combined-feature information, thereby improving the classification accuracy of malicious-attack encrypted traffic.
Drawings
Fig. 1 is a general flowchart of a vulnerability detection method based on an improved time convolution network.
Fig. 2 is a frame diagram of a malicious attack encryption traffic classification method based on a multi-head self-attention mechanism.
FIG. 3 is a diagram of the TLS-MHSA model architecture.
FIG. 4 is a diagram of a multi-headed self-attention layer structure in the TLS-MHSA model.
FIG. 5 shows the information of the data sample set used in the experimental section of the invention.
FIG. 6 is a confusion matrix of the TLS-MHSA classification results.
Fig. 7 is a comparison of classification results of three malicious attack encryption traffic classification methods.
FIG. 8 compares the accuracy of the three exploit-attack encrypted-traffic classification models in the comparison experiments: RF, DeepFM, and the TLS-MHSA proposed by the invention.
FIG. 9 compares the recall ratio of the three models in the comparison experiments: RF, DeepFM, and the proposed TLS-MHSA.
FIG. 10 compares the F1-measure of the three models in the comparison experiments: RF, DeepFM, and the proposed TLS-MHSA.
Detailed Description
The invention will be further described with reference to the accompanying drawings and embodiments, which are described for the purpose of facilitating an understanding of the invention and are not intended to be limiting in any way.
The invention aims at the classification of vulnerability exploitation attack encrypted traffic and provides a classification method based on a multi-head self-attention mechanism that classifies such traffic effectively. The invention builds a complete classification model and carries out thorough experiments to demonstrate the effectiveness and feasibility of the method.
As shown in figs. 1 to 10, the method for classifying vulnerability exploitation attack encrypted traffic based on the multi-head self-attention mechanism provided by the invention includes:
Step 1, parsing the encrypted traffic data to be classified into json-format data and filtering the parsed data:
step 1.1, analyzing the encrypted flow data set into json format data;
step 1.2, cleaning and filtering the analyzed flow data, deleting redundant flow data, and only keeping TLS encrypted flow data;
step 1.3, shuffling the order of the original traffic data.
In the embodiment of the invention, the data are parsed into json format to facilitate subsequent operations such as traffic filtering; redundant traffic data are deleted to prevent the model from learning useless information that would degrade the classification effect; and the original traffic order is shuffled so that the model generalizes better after learning.
Step 2, analyzing metadata in the vulnerability exploitation attack encryption flow and key features in a TLS protocol, and extracting core features required by flow classification;
step 2.1, TLS-protocol encrypted-traffic feature selection: obtaining the differences in the parameters and extension information of the TLS (Transport Layer Security) protocol between exploit-attack encrypted traffic and normal encrypted traffic, and extracting the differing parameters and extension information of the TLS protocol as the key features of TLS encrypted traffic;
step 2.2, metadata feature extraction: extracting the metadata features possessed by the traffic data, such as IP address, port and ingress/egress bytes, as auxiliary features;
step 2.3, marking a label on each piece of traffic data after feature extraction, i.e. marking the actual type of the traffic.
In the embodiment of the invention, feature extraction from traffic falls into two categories: metadata feature selection and TLS-protocol encrypted-traffic feature selection. The TLS-protocol features mainly refer to the parameter information and TLS extension information in the protocol; the invention extracts the cipher suite, the TLS extensions and their related information as features. As an important link in the handshake process when a TLS connection is established, the related information of the TLS certificate is also worth using as a traffic feature. In addition, the invention retains non-critical TLS parameter information, such as the client key length, as auxiliary features, so that the overall TLS encrypted-traffic features are more comprehensive and the subsequent classification achieves better results.
Metadata features are features of the metadata possessed by all traffic and are the traffic features used by traditional traffic classification, such as IP address, port, and ingress/egress bytes. Most of the time the IP address is meaningless for identifying malicious traffic and easily misleads the model, so all IP-address information is deleted during feature extraction. Some metadata, such as ingress/egress bytes and ingress/egress packets, is worth keeping for identifying malicious attackers, since these quantities are affected only by the transmitted data, and normal and malicious traffic differ somewhat in these behaviors. Borrowing from the traditional standard-port-matching identification method, the invention takes the port as one of the traffic features. Considering the behavioral differences between malicious and benign traffic, the invention also adds window-sequence statistical features, for which the system uses a Markov transition matrix to capture the relationship between adjacent packets. In addition to the above features, the mean byte distance, the standard deviation, and the byte entropy are retained for feature identification; as traffic metadata, these features help the model improve the classification accuracy of malicious-attack encrypted traffic.
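The byte-entropy and adjacent-packet statistics mentioned above can be sketched as follows. This is an illustrative pure-Python version; the bucket size and number of states for the Markov transition matrix are assumptions, not values from the patent:

```python
import math
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy (bits per byte) of a payload; close to 8 for encrypted data."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def markov_transition_matrix(packet_sizes, bucket=150, n_states=10):
    """Capture the relationship between adjacent packets: bucket each packet
    length into one of n_states and count state-to-state transitions."""
    state = lambda s: min(s // bucket, n_states - 1)
    m = [[0.0] * n_states for _ in range(n_states)]
    for prev, cur in zip(packet_sizes, packet_sizes[1:]):
        m[state(prev)][state(cur)] += 1
    for row in m:  # row-normalise the counts into transition probabilities
        total = sum(row)
        if total:
            for j in range(n_states):
                row[j] /= total
    return m

print(round(byte_entropy(bytes(range(256))), 2))  # uniform byte distribution -> 8.0
m = markov_transition_matrix([60, 1500, 1500, 60, 1500])
```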
Step 3, dividing the preprocessed encrypted traffic data into a training set and a test set in a suitable proportion, taking the different types of malicious-attack traffic as labels, training the multi-head self-attention network model with the training-set data, evaluating the model with the test-set data, and constructing the final vulnerability exploitation attack encrypted-traffic classification model after optimization;
step 3.1, dividing the processed flow data into a training set and a test set according to a proper proportion;
step 3.2, constructing the neural-network exploit-attack encrypted-traffic classification model TLS-MHSA based on the multi-head self-attention mechanism, and feeding the preprocessed traffic data into TLS-MHSA; the model comprises an input layer, an embedding layer, multi-head self-attention layers and an output layer.
The input layer converts the input features into corresponding feature vectors according to their feature types; for example, discrete features are converted into one-hot vectors. Here A denotes the total number of feature classes and x_i the i-th feature class:
x = [x_1; x_2; ...; x_A]
The main role of the embedding layer is to convert sparse feature vectors into dense feature vectors suitable for learning. Mapping the feature vectors processed by the input layer into a low-dimensional space, and representing each flow classification feature by a plurality of low-dimensional vectors, OiRepresenting an embedded matrix with a feature type i correspondence, xiThen it is the one-hot coded expression vector of the corresponding feature of the feature type, and the conversion formula is as follows:
ei=Oixiif the feature is a multi-valued feature, in this case the class variable xiNot a single heat vector but a multiple heat vector. In order to be compatible with such a case of multi-valued input, a multi-valued feature type S is expressed as an average value of corresponding feature vector vectors to normalize the features, where v represents the number of the multi-valued feature types, and a conversion formula is as follows:
ei=1/vOixi
in order to enable the discrete features and the continuous features to be combined with each other, the continuous features are mapped into a low latitude dense feature vector space, and the continuous features are expressed as the multiplication result of feature values and corresponding embedded vectors. Wherein v isaIs an embedded vector of feature type a, xaIs a scalar value. The continuous features can be expressed as the following formula:
ea=vaxa
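The three embedding-layer conversions above, one-hot, averaged multi-hot, and scaled continuous, can be sketched in NumPy. The dimensions and random weights are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4          # embedding dimension d (illustrative)
n_values = 6   # vocabulary size of one categorical feature field

O = rng.normal(size=(d, n_values))  # embedding matrix O_i for feature type i

# One-hot feature: e_i = O_i x_i (equivalent to selecting one column of O_i)
x_onehot = np.zeros(n_values); x_onehot[2] = 1.0
e_onehot = O @ x_onehot

# Multi-valued feature: e_i = (1/v) O_i x_i, the average of the v active embeddings
x_multi = np.zeros(n_values); x_multi[[1, 3, 4]] = 1.0
e_multi = (O @ x_multi) / 3

# Continuous feature: e_a = v_a x_a, the embedding vector scaled by the value
v_a = rng.normal(size=d)
e_cont = v_a * 0.75

print(e_onehot.shape, e_multi.shape, e_cont.shape)  # all (4,)
```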
The role of the multi-head self-attention layer is to capture the correlations among the traffic features and select meaningful features for high-order combination. The core operation of the attention mechanism is to obtain the attention weight of a feature combination directly through computation, and then judge the importance of the combination through the weighted sum. The multi-head mechanism performs several groups of self-attention on the input classified feature vectors, then concatenates all the self-attention results and applies a linear transformation to obtain the final result.
The steps for obtaining the important high-order feature combinations are as follows; taking feature a as an example, we explain how to find the important high-order features of feature a:
(1) Select any other feature b and calculate the relevance score of features a and b under attention head h:
α_{a,b}^{(h)} = exp(s^{(h)}(e_a, e_b)) / Σ_{l=1}^{A} exp(s^{(h)}(e_a, e_l))
where s^{(h)} is the attention scoring function, for which the common dot-product model is chosen here:
s^{(h)}(e_a, e_b) = ⟨W_Query^{(h)} e_a, W_Key^{(h)} e_b⟩
where W_Query^{(h)}, W_Key^{(h)} and W_Value^{(h)} are all transformation matrices of dimension d′ × d; that is, they map the original embedding space R^d into a new space R^{d′}, converting the vector representations e_a and e_b of the traffic features x_a and x_b from the d-dimensional space into the d′-dimensional space.
(2) Through the coefficients α_{a,b}^{(h)}, all associated features are used to update the representation of feature a in subspace h, i.e. each feature is represented as the weighted sum of all other relevant features. The newly learned feature can be expressed as:
ẽ_a^{(h)} = Σ_{b=1}^{A} α_{a,b}^{(h)} (W_Value^{(h)} e_b)
where W_Value^{(h)} is also a d′ × d transformation matrix, and ẽ_a^{(h)}, in the d′-dimensional space, is the combined feature obtained by crossing feature x_a with its related features under subspace h, representing a new combined feature learned by the method.
(3) The multi-head self-attention mechanism extends self-attention from one head to several heads, so that different feature-cross information can be learned from the subspaces represented by the different heads. The feature-cross results learned by the different heads are concatenated according to the following formula, where the symbol ⊕ denotes the concatenation operation, H is the number of heads used by the multi-head self-attention mechanism, and ẽ_a^{(i)} (i = 1, 2, …, H) is the combined feature of x_a crossed with its related features under subspace i; the feature-cross result ẽ_a is:
ẽ_a = ẽ_a^{(1)} ⊕ ẽ_a^{(2)} ⊕ … ⊕ ẽ_a^{(H)}
To let the model learn high-order traffic features while retaining the low-order original features, the method also adds a classical residual network to the multi-head self-attention layer, where W_Res is a transformation matrix that aligns the dimension of e_a with that of ẽ_a, and ReLU(t) = max(0, t) is a nonlinear activation function:
e_a^{Res} = ReLU(ẽ_a + W_Res e_a)
The output layer is a classifier composed of a fully connected layer and softmax. The fully connected layer maps the input feature vector through a linear transformation into the sample label space R^C, obtaining a vector z ∈ R^C, where C is the total number of traffic classes to be distinguished; the softmax classification function then yields the final classification result.
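A compact NumPy sketch of the forward pass described above: per-head attention weights, weighted value sums, head concatenation, the residual connection with W_Res, and the softmax output layer. All shapes and random weights are illustrative assumptions, not the patent's trained parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
A, d, d_head, H, C = 8, 16, 4, 2, 3   # features, embed dim d, per-head dim d', heads, classes

E = rng.normal(size=(A, d))            # embedded features e_1..e_A as rows
W_Q = rng.normal(size=(H, d_head, d))  # per-head d' x d transformation matrices
W_K = rng.normal(size=(H, d_head, d))
W_V = rng.normal(size=(H, d_head, d))
W_Res = rng.normal(size=(H * d_head, d))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    ez = np.exp(z)
    return ez / ez.sum(axis=axis, keepdims=True)

heads = []
for h in range(H):
    Q, K, V = E @ W_Q[h].T, E @ W_K[h].T, E @ W_V[h].T  # map into the head's subspace
    alpha = softmax(Q @ K.T, axis=1)                    # alpha[a, b]: relevance of b to a
    heads.append(alpha @ V)                             # weighted sum of value vectors
E_tilde = np.concatenate(heads, axis=1)                 # concatenate the H heads
E_res = np.maximum(0.0, E_tilde + E @ W_Res.T)          # residual connection + ReLU

# Output layer: fully connected mapping into R^C, then softmax
W_out = rng.normal(size=(C, A * H * d_head)) * 0.1
z = W_out @ E_res.reshape(-1)
y = softmax(z)
print(E_res.shape, y.shape, round(float(y.sum()), 6))  # (8, 8) (3,) 1.0
```

A single interacting layer is shown; stacking several such layers, as the patent describes, repeats this block with E_res as the next input.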
And 3.3, training the classification model by using the training set divided in the step 3.1, and judging and optimizing parameters by using a corresponding test set to obtain a final vulnerability exploitation attack encryption flow classification model.
The optimization method used by the invention is the adaptive-learning-rate algorithm Adam, and the model parameters are saved once the optimal model has been trained. The training objective minimized by Adam is the cross-entropy loss

L = −Σ_{i=1}^{K} y_i · log(p_i)

wherein p_i is the probability vector of the predicted detection result, y_i is the actual sample label category, and K is the total number of sample label categories.
Step 4: parse and clean the encrypted traffic to be detected following the procedure of step 1, extract features in the manner of step 2, and input the extracted feature data into the trained classification model to obtain the final traffic classification result.
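The detection pipeline of this step (parse to JSON, filter down to TLS traffic, extract feature rows for the classifier) can be sketched as below. The patent does not specify a JSON schema, so every field name and value here is a hypothetical placeholder.

```python
import json

# Hypothetical flow records as produced by step 1's JSON parsing; the field
# names ("protocol", "cipher_suite", "dst_port") are illustrative only.
raw = """
[{"protocol": "TLS", "cipher_suite": "TLS_AES_128_GCM_SHA256", "dst_port": 443},
 {"protocol": "DNS", "dst_port": 53},
 {"protocol": "TLS", "cipher_suite": "TLS_CHACHA20_POLY1305_SHA256", "dst_port": 8443}]
"""

flows = json.loads(raw)
# Cleaning/filtering: keep only TLS-encrypted flows, discard the rest.
tls_flows = [f for f in flows if f.get("protocol") == "TLS"]

# Feature extraction: one row per flow, ready to be written to CSV and
# fed to the trained classification model.
rows = [[f["dst_port"], f["cipher_suite"]] for f in tls_flows]
print(len(rows))
```

The same filter-then-extract shape applies whatever the real parsed schema looks like; only the field names would change.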
Claims (6)
1. A vulnerability exploitation attack encryption flow classification method based on a multi-head self-attention mechanism is characterized by comprising the following steps:
step 1, parsing the encrypted traffic data to be classified into JSON-format data, and filtering the parsed data;
step 2, analyzing the metadata in the vulnerability exploitation attack encrypted traffic and the key features of the TLS protocol, extracting the core features required for traffic classification, and finally converting the core features into a CSV-format file;
step 3, dividing the processed encrypted traffic data proportionally into a training set and a test set, taking the malicious attack traffic type as the label, training the multi-head self-attention mechanism network model with the training set, evaluating the model with the test set, and optimizing it to obtain the final vulnerability exploitation attack encrypted traffic classification model;
and step 4, preprocessing the encrypted traffic to be detected according to step 1, extracting features according to step 2, and inputting the extracted features into the trained classification model to obtain the final traffic classification result.
2. The method for classifying the exploit attack encrypted traffic based on the multi-head self-attention mechanism as claimed in claim 1, wherein the step 1 is implemented by the following steps:
step 1.1, parsing the encrypted traffic data set into JSON-format data;
step 1.2, cleaning and filtering the parsed traffic data, deleting redundant traffic data, and keeping only TLS-encrypted traffic data;
and step 1.3, shuffling the order of the original traffic data to improve the generalization ability of the trained model.
3. The method as claimed in claim 1, wherein the step 2 is implemented by the following steps:
step 2.1, TLS protocol encrypted-traffic feature selection: obtain the differences in the TLS (Transport Layer Security) protocol parameters and extension information between vulnerability exploitation attack encrypted traffic and normal encrypted traffic, and extract the differing parameters and extension information as the key features of the TLS-encrypted traffic;
step 2.2, metadata feature extraction: extract metadata features contained in the traffic data, such as IP addresses, ports, and byte counts, as auxiliary features;
and step 2.3, after feature extraction, attach a label to each piece of traffic data, i.e., mark the actual type of the traffic.
4. The method as claimed in claim 1, wherein the step 3 is implemented by the following steps:
step 3.1, dividing the processed traffic data into a training set and a test set at a ratio of 8:2;
step 3.2, constructing a neural-network vulnerability exploitation attack encrypted-traffic classification model, TLS-MHSA, based on the multi-head self-attention (MHSA) mechanism, and inputting the preprocessed traffic data into TLS-MHSA, the model comprising an input layer, an embedding layer, multi-head self-attention layers, and an output layer; first, the preprocessed network-traffic feature vector x is fed into the input layer, then the embedding layer maps all features into the same low-dimensional space and outputs low-dimensional vectors; next, the multi-head self-attention mechanism maps the vectors into a plurality of subspaces and combines them into different high-order features; stacking multiple multi-head self-attention layers yields multiple high-order feature combinations, and the effectiveness of the feature combinations is judged by the attention mechanism; finally, the feature-combination vector obtained from the previous layer is input into the fully connected layer, and the classification result is output through the softmax function;
and step 3.3, training the classification model with the training set divided in step 3.1, and evaluating and optimizing the parameters with the corresponding test set to obtain the final vulnerability exploitation attack encrypted traffic classification model.
5. The method of claim 3, further comprising extracting the cipher suites, the TLS extensions, and their related information as features, these fields being an important part of the handshake process when the TLS protocol establishes a connection, while non-critical parameter information in the TLS protocol, such as the client key length, is retained as an auxiliary feature.
6. The method of claim 1, wherein the step of obtaining high-order feature combinations comprises, for a given feature a:
(1) selecting any other feature b, and calculating the relevance score of features a and b under attention head h as

α^(h)_{a,b} = exp(s^(h)(e_a, e_b)) / Σ_l exp(s^(h)(e_a, e_l))
wherein s^(h) is the attention scoring function, for which the common dot-product model is chosen:

s^(h)(e_a, e_b) = ⟨W_Q^(h) · e_a, W_K^(h) · e_b⟩
wherein W_Q^(h) and W_K^(h) are transformation matrices, each of dimension d' × d, i.e., they map the original embedding space R^d into a new space R^{d'}; the traffic features x_a and x_b are represented as the vectors e_a and e_b respectively, and the feature vectors are thereby mapped from the d-dimensional space into the d'-dimensional space;
(2) using the coefficients α^(h)_{a,b}, all related features are combined to update the representation of feature a in subspace h, i.e., each feature is represented as a weighted sum of all other relevant features; the newly learned feature can be expressed as

ẽ_a^(h) = Σ_b α^(h)_{a,b} · (W_V^(h) · e_b)
wherein W_V^(h) is likewise a d' × d transformation matrix into the d'-dimensional space, and ẽ_a^(h) is the combined feature obtained by crossing feature x_a with its related features under subspace h, representing a new combined feature learned by the method;
(3) using the multi-head self-attention mechanism, self-attention is expanded from one head to a plurality of heads, so that different feature-crossing information can be learned in the subspaces represented by the different heads; the feature-crossing results learned by the different heads are concatenated as

ẽ_a = ẽ_a^(1) ⊕ ẽ_a^(2) ⊕ … ⊕ ẽ_a^(H)

wherein ⊕ denotes the concatenation operation, H denotes the number of heads used by the multi-head self-attention mechanism, and ẽ_a^(i) (i = 1, 2, …, H) is the combined feature obtained by crossing feature x_a with its related features under subspace i;
to enable the model to learn high-order traffic features while retaining the low-order original traffic features, the invention also adds a classical residual connection to the multi-head self-attention layer:

e_a^Res = ReLU(ẽ_a + W_Res · e_a)

wherein W_Res is a transformation matrix that aligns the dimension of e_a with that of ẽ_a, and ReLU(t) = max(0, t) is the nonlinear activation function;
the output layer is a classifier composed of a fully connected layer followed by softmax, wherein the fully connected layer maps the input feature vector through a linear transformation into the sample label space R^C, producing a vector z ∈ R^C, where C is the total number of traffic classes to be distinguished, and the softmax classification function then yields the final classification result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210905960.7A CN115277216A (en) | 2022-07-29 | 2022-07-29 | Vulnerability exploitation attack encryption flow classification method based on multi-head self-attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115277216A true CN115277216A (en) | 2022-11-01 |
Family
ID=83771352
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210905960.7A Pending CN115277216A (en) | 2022-07-29 | 2022-07-29 | Vulnerability exploitation attack encryption flow classification method based on multi-head self-attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115277216A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113705619A (en) * | 2021-08-03 | 2021-11-26 | 广州大学 | Malicious traffic detection method, system, computer and medium |
CN114330544A (en) * | 2021-12-28 | 2022-04-12 | 国网冀北电力有限公司信息通信分公司 | Method for establishing business flow abnormity detection model and abnormity detection method |
CN114745155A (en) * | 2022-03-14 | 2022-07-12 | 河海大学 | Network abnormal flow detection method, device and storage medium |
Non-Patent Citations (3)
Title |
---|
JINFU CHEN et al.: "TLS-MHSA: An Efficient Detection Model for Encrypted Malicious Traffic based on Multi-Head Self-Attention Mechanism", ACM, vol. 26, no. 4, 31 October 2023 (2023-10-31), XP059195532, DOI: 10.1145/3613960 *
LI Heng; SHEN Huawei; CHENG Xueqi; ZHAI Yong: "Survey of defense mechanisms against high-volume distributed denial-of-service attacks" (in Chinese), Netinfo Security (信息网络安全), no. 05, 10 May 2017 (2017-05-10) *
JIANG Tongtong et al.: "Malicious encrypted traffic identification based on hierarchical spatio-temporal features and multi-head attention" (in Chinese), Computer Engineering (计算机工程), vol. 47, no. 7, 31 July 2021 (2021-07-31) *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116090537A (en) * | 2023-03-07 | 2023-05-09 | 特斯联科技集团有限公司 | Training method of commodity recommendation model |
CN116506216A (en) * | 2023-06-19 | 2023-07-28 | 国网上海能源互联网研究院有限公司 | Lightweight malicious flow detection and evidence-storage method, device, equipment and medium |
CN116506216B (en) * | 2023-06-19 | 2023-09-12 | 国网上海能源互联网研究院有限公司 | Lightweight malicious flow detection and evidence-storage method, device, equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Thing | IEEE 802.11 network anomaly detection and attack classification: A deep learning approach | |
CN115277216A (en) | Vulnerability exploitation attack encryption flow classification method based on multi-head self-attention mechanism | |
CN111340191B (en) | Bot network malicious traffic classification method and system based on ensemble learning | |
Liu et al. | A byte-level CNN method to detect DNS tunnels | |
CN108768986A (en) | A kind of encryption traffic classification method and server, computer readable storage medium | |
CN110611640A (en) | DNS protocol hidden channel detection method based on random forest | |
CN113989583A (en) | Method and system for detecting malicious traffic of internet | |
CN113364787B (en) | Botnet flow detection method based on parallel neural network | |
CN112804253B (en) | Network flow classification detection method, system and storage medium | |
CN113329023A (en) | Encrypted flow malice detection model establishing and detecting method and system | |
CN112688928A (en) | Network attack flow data enhancement method and system combining self-encoder and WGAN | |
CN113949531B (en) | Malicious encrypted flow detection method and device | |
CN109831422A (en) | A kind of encryption traffic classification method based on end-to-end sequence network | |
Fries | A fuzzy-genetic approach to network intrusion detection | |
CN104135385A (en) | Method of application classification in Tor anonymous communication flow | |
CN113821793B (en) | Multi-stage attack scene construction method and system based on graph convolution neural network | |
CN114330544A (en) | Method for establishing business flow abnormity detection model and abnormity detection method | |
Kong et al. | Identification of abnormal network traffic using support vector machine | |
Thom et al. | Smart recon: Network traffic fingerprinting for iot device identification | |
CN112507336A (en) | Server-side malicious program detection method based on code characteristics and flow behaviors | |
CN114615088A (en) | Terminal service flow abnormity detection model establishing method and abnormity detection method | |
Liu et al. | Spatial‐Temporal Feature with Dual‐Attention Mechanism for Encrypted Malicious Traffic Detection | |
CN117318980A (en) | Small sample scene-oriented self-supervision learning malicious traffic detection method | |
CN111211948A (en) | Shodan flow identification method based on load characteristics and statistical characteristics | |
Zhang et al. | Network traffic classification method based on improved capsule neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||