CN113723440A - Encrypted TLS application traffic classification method and system on cloud platform - Google Patents


Info

Publication number
CN113723440A
Authority
CN
China
Prior art keywords
vector
application
self
classification
attention
Prior art date
Legal status
Granted
Application number
CN202110669055.1A
Other languages
Chinese (zh)
Other versions
CN113723440B (en)
Inventor
王一鹏 (Wang Yipeng)
云晓春 (Yun Xiaochun)
赖英旭 (Lai Yingxu)
Current Assignee
Beijing University of Technology
National Computer Network and Information Security Management Center
Original Assignee
Beijing University of Technology
National Computer Network and Information Security Management Center
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology, National Computer Network and Information Security Management Center filed Critical Beijing University of Technology
Priority to CN202110669055.1A
Publication of CN113723440A
Application granted
Publication of CN113723440B
Legal status: Active

Classifications

    • G06F 18/2415 — Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. likelihood ratio or false-acceptance versus false-rejection rate
    • G06F 18/253 — Pattern recognition; fusion techniques applied to extracted features
    • G06N 3/045 — Neural network architectures: combinations of networks
    • G06N 3/047 — Probabilistic or stochastic neural networks
    • G06N 3/084 — Neural network learning methods: backpropagation, e.g. using gradient descent


Abstract

The invention discloses a method and a system for classifying encrypted TLS application traffic on a cloud platform. The method comprises a training stage and a classification stage. The training stage includes: uniformly processing encrypted TLS application flow samples; and learning from the resulting training data to construct an application classification model. The classification stage includes: uniformly processing unclassified encrypted TLS application flow samples; judging the application type of each flow sample to be tested according to the classification model obtained in the training stage; and outputting the judgment result. By extracting the packet length sequence of each network flow and combining a gating mechanism, a self-attention mechanism, and other techniques, the method and system classify encrypted TLS application traffic on a cloud platform with high accuracy and efficiency.

Description

Encrypted TLS application traffic classification method and system on cloud platform
Technical Field
The invention relates to the automatic classification of encrypted TLS application traffic on a cloud platform using deep learning, based on the packet length sequence information of network flows, and in particular to a method and a system for classifying encrypted TLS application traffic on a cloud platform.
Background
Network traffic classification is the task of associating network traffic with the specific application protocol or application that generated it. It has numerous practical applications in computer networking and network security, such as network quality of service (QoS) assurance, tunnel detection, network measurement, network traceback, and network intrusion detection and prevention. For network management, Internet service providers need to understand and classify the traffic mix produced by different applications in order to achieve better quality of service and network configuration, e.g., by applying different priority policies to traffic from different application types. Driven by these practical needs, research in this field has attracted extensive attention from academia and industry over the last decade, with advanced traffic analysis methods continuously emerging and evolving — a trend that demonstrates the sustained research interest in network traffic classification. Notably, encryption technology is now widely deployed to ensure the privacy and security of application data in transit. With the explosive growth of the mobile Internet, encrypted TLS traffic has risen dramatically and now occupies a large share of Internet traffic. In particular, the world's two largest mobile application markets, the Apple App Store and Google Play, require applications to encrypt network transport data. The widespread use of encryption makes traditional payload-based application-fingerprinting methods no longer feasible, posing significant challenges to network security management.
For mobile applications that use the TLS protocol as their encryption base, the actual application data they transmit is, from the network traffic perspective, no longer plaintext but encrypted content hidden by an encryption algorithm inside TLS protocol messages. In 2019, Chen et al. proposed a TLS application traffic classification method named MAAF, which uses the X.509 certificate exchanged during the connection-setup handshake of a TLS application flow to classify TLS flows generated by different mobile applications. Specifically, MAAF parses the X.509 certificate in each TLS flow and extracts its "organizationName" and "commonName" fields for classification. MAAF is very effective at distinguishing application flows from different companies. Notably, however, different applications developed by the same company typically run on the same "cloud" platform and provide the corresponding Internet services to users, so the certificate information embedded in TLS flows generated by different "cloud" applications from the same company is typically identical. Take six applications developed by Alibaba — Alipay, Amap, Taobao, Tao Piao Piao, Xiami Music, and Youku — as an example: parsing the X.509 certificates in the TLS application flows of these applications shows that many certificates across the six applications share the same "commonName" and "organizationName" fields, e.g., a "commonName" of "*.alicdn.com" and an "organizationName" of "Alibaba (China) Technology Co., Ltd.". Consequently, MAAF cannot correctly distinguish such TLS application flows on "cloud" platforms.
An alternative solution is to classify encrypted traffic using the data message sequence or message type sequence of each TLS flow. Rather than directly examining the specific content generated by a mobile application, sequence-information-based approaches attempt to understand the generation mechanism of different applications' traffic by observing state transitions within each TLS stream. In addition, inspired by the great success of deep learning in fields such as speech recognition, computer vision, and machine translation, the invention explores advanced deep learning techniques capable of classifying encrypted TLS traffic on a cloud platform.
The invention designs and implements a novel method and system for classifying encrypted TLS application traffic on a "cloud" platform. By extracting the packet length sequence of each network flow and combining a gating mechanism, a self-attention mechanism, and other techniques, the method and system classify encrypted TLS application traffic on a "cloud" platform with high accuracy and efficiency.
Existing mainstream encrypted traffic classification methods generally fall into two categories: (1) flow-statistics-based methods, which classify encrypted data flows by identifying statistical patterns (e.g., flow duration, flow idle time, distributions of packet arrival time and packet length) in externally observable flow attributes; and (2) flow-sequence-based methods, which classify encrypted flows by identifying message or message-type sequences (e.g., state transitions in a message type sequence) in externally observable flow properties. The present invention is a flow-sequence-based solution, so existing work based on flow statistics is outside the scope of this discussion. Flow-sequence-based classification methods can be further subdivided into two subclasses: 1) Markov-chain-based methods and 2) deep-learning-based methods. Two limitations of prior flow-sequence-based classification methods are discussed next:
(1) first, existing markov chain-based classification methods have only short-term memory, and no long-term memory, for elements in a sequence. Specifically, existing markov chain-based methods use first order homogeneous or second order homogeneous markov chain models to construct application fingerprints that can be used for TLS traffic classification. The Markov chain model for each application is constructed from the ordered TLS data message or message sequence generated by the application. Recall that state transitions in a first order homogeneous/second order homogeneous Markov chain can only consider two/three adjacent states. It is apparent that it is difficult to capture the long-term relationships between states in a given TLS stream using a low-order (e.g., first or second order) markov chain model. Therefore, these models do not perform well in the task of encryption traffic classification. Markov chain models that take into account more neighboring states (i.e., higher order Markov chain models) may alleviate the above problem to some extent. It is noted, however, that the size of the transition matrix of a markov chain model grows exponentially with its order, and thus the transition matrix of a high order markov chain model is typically very sparse. Therefore, the high-order markov chain model faces a high risk of overfitting, which makes the classification accuracy not significantly improved.
(2) Second, in recent years, a deep learning method based on a Recurrent Neural Network (RNN) has been applied to TLS application traffic classification. However, the sequential nature of RNNs is a major pain point for computation and learning using GPUs. When processing sequence data using an RNN, each hidden state in the RNN needs to have the previous hidden state as input. In practice, it is worth noting that the GPU has a large amount of computing power, but the sequential nature of RNNs makes it necessary for the GPU to wait for data to be available. Furthermore, learning long-range dependencies in sequential data remains a challenging problem for RNNs. Theoretically, advanced RNN models (e.g., LSTM and GRU) can have longer memory capabilities. However, practice has found that this long term memory still becomes ambiguous as the distance between data elements in a given sequence increases.
This patent is intended to address two technical deficiencies associated with previous methods or systems.
Disclosure of Invention
The invention aims to design and realize a method and a system for classifying encrypted TLS application traffic on a 'cloud' platform, so that in the process of classifying the encrypted TLS application traffic on the 'cloud' platform, robust feature expressions can be constructed for TLS network data streams generated by different types of applications, and further, the encrypted TLS application traffic classification with high accuracy and high efficiency can be realized.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
a classification method for encrypted TLS application traffic on a cloud platform is characterized by comprising a training stage and a classification stage;
the training phase comprises the following steps:
1) taking encrypted TLS application flow of a known type as input, and converting each flow into a formatted message length value sequence;
2) using the message length value sequence obtained in the step 1) as input to form a training data set, and constructing an encrypted TLS application flow classification model by adopting a supervised learning mode;
the classification phase comprises the following steps:
3) taking network flow data as input, acquiring unclassified encrypted TLS application flow, and converting the encrypted TLS application flow to be detected into a message length value sequence in the same format as that in the step 1);
4) according to the encrypted TLS application traffic classification model obtained in step 2) of the training stage, judge the type of the encrypted TLS application traffic to be classified and output the judgment result.
An encrypted TLS application traffic classification system on a cloud platform comprises a traffic processing module and an application classification model building module used in the training stage, and a traffic processing module and an application classification module used in the classification stage; wherein:
the flow processing module is responsible for carrying out statistical processing on the input protocol flow of the encrypted TLS application, and converting each flow into a message length value sequence in a specified format to be used as a training sample or a sample to be classified;
the application classification model building module in the training stage is responsible for building an application classification model of the encrypted TLS flow according to the encrypted TLS flow training sample of the known application type and performing model training to enable the application classification model to meet the training termination condition for the classification stage;
and the application classification module of the classification stage is responsible for carrying out feature extraction on the message length value sequence of the encrypted TLS flow to be classified acquired by the flow processing module according to the application classification model generated in the training stage, judging the application type of the message length value sequence and outputting a judgment result.
The key technical points of the invention are as follows:
1. A gating mechanism structure is designed, which forms different feature expressions of a given sequence under different window scales, thereby adaptively generating, for sequential input data, new sequence features that fuse multiple variable-length expressions.
2. A self-attentive mechanism is devised that learns the long-term dependencies of elements in given sequence data on each other. This mechanism captures not only the associations between adjacent elements in the sequence data (short term memory of the model to the sequence elements), but also the associations between elements that are further apart in the sequence (long term memory of the model to the sequence elements).
The method achieves accurate classification of encrypted TLS application traffic on a cloud platform and has the following advantages over previously disclosed related techniques:
1. It can effectively identify encrypted TLS traffic generated by different applications on a cloud platform. The invention operates entirely on the packet length sequence of each TLS application flow and does not need to reassemble IP packets into TLS messages.
2. The invention constructs the feature expression of the packet length sequence by combining a gating mechanism with a self-attention mechanism. This design flexibly forms, from the raw sequence input, a weighted fused feature expression of the sequence under different window scales. The design also learns long-term dependencies between elements in the sequence data. With these design considerations, robust feature expressions can be learned automatically from the raw sequence input for accurate TLS application traffic classification.
Drawings
FIG. 1 is a flow diagram of the training stage of the encrypted TLS application traffic classification method.
FIG. 2 is a flow diagram of the classification stage of the encrypted TLS application traffic classification method.
FIG. 3 is an architecture diagram of the application classification system for encrypted TLS traffic.
FIG. 4 shows the experimental results of six application protocols on the validation set.
FIG. 5 shows the experimental results of six application protocols on the test set.
Detailed Description
The work flow of the invention can be divided into two stages of a training stage and a classification stage. In the training stage, the encrypted TLS application traffic with the marked classes is used as a training data set to train learnable parameters in a neural network, so that automatic feature extraction and application class classification of the encrypted TLS traffic are realized. In the classification stage, based on the trained application classification model of the encrypted TLS traffic, feature extraction is carried out on the obtained unclassified encrypted TLS application traffic in the network environment, and application classification of the TLS traffic is completed.
A training stage: in the training stage of the invention, the encrypted TLS application traffic with known type is used as input and is used as a training set, and an application classification model of the encrypted TLS traffic is obtained through iterative training. The operation steps of the training phase are shown in fig. 1. For ease of understanding, the specific operation steps of the training phase of the present invention will be described later with respect to a single encrypted TLS application traffic sample.
1. Encrypted TLS application traffic handling
Given an arbitrary encrypted TLS application traffic sample flow = {packet_1, packet_2, …, packet_k}, i.e., a protocol flow sample consisting of k packets. First, the actual byte length of each packet is counted; for packet_i, its byte length is s_i. The traffic sample is thus converted into the packet-byte-length sequence S′ = {s_1, s_2, …, s_k}. Let l be the input length required by the classification model; the sequence S′ must be adjusted to equal this length. If k > l, only the first l items of the sequence are kept and the rest are discarded; if k = l, no adjustment is made; if k < l, then l − k zeros are appended at the tail of the sequence so that its length becomes l. Finally, a packet-byte-length sequence S = {s_1, s_2, …, s_l} of fixed length l is obtained.
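The truncate-or-pad rule above can be sketched in a few lines of Python (an illustrative sketch, not code from the patent; the function name is hypothetical):

```python
def to_fixed_length(packet_lengths, l):
    """Truncate or zero-pad a per-packet byte-length sequence to model input length l."""
    if len(packet_lengths) >= l:
        return packet_lengths[:l]                        # k >= l: keep only the first l packets
    return packet_lengths + [0] * (l - len(packet_lengths))  # k < l: pad the tail with zeros
```

For example, a 3-packet flow padded to length 5 becomes `[100, 200, 300, 0, 0]`.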
The key of the training phase is the construction of the traffic classification model for the encryption TLS application. The encrypted TLS application traffic classification model used in the method is based on a deep neural network, and the specific implementation steps of constructing the classification model in the training stage are as follows:
2. embedded characterization
Before feature extraction, the single-valued packet length information must be converted into embedding vectors. The method uses a trainable embedding layer to generate a corresponding representation vector for each distinct packet length value. If the range of packet length values is {1, 2, …, v}, the input value space of the embedding layer is {0, 1, 2, …, v}, giving v + 1 possible values, i.e., an input dimension of v + 1. Let d be the output dimension of the embedding layer; any packet length value s_i is then converted into a d-dimensional length-value embedding vector e_i ∈ R^d, and the packet-byte-length sequence S is converted into an encrypted TLS application traffic sample embedding vector x of dimension (l × d).
It should be noted that the neural network parameters of the embedding layer are trainable; therefore, as the classification model is iteratively trained, the representation vectors generated by the embedding layer are also iteratively updated.
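As an illustration of the lookup step only (not the patent's implementation — the trainable parameters are replaced here by a fixed random initialization, and all names are hypothetical):

```python
import random

def make_embedding_table(v, d, seed=0):
    """Lookup table for length values 0..v (v+1 rows), each row a d-dim vector.
    Random initialization stands in for the layer's trainable parameters."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(d)] for _ in range(v + 1)]

def embed_sequence(S, table):
    """Map a fixed-length sequence of length values to an (l x d) embedding x."""
    return [table[s] for s in S]
```

During training the table rows would be updated by backpropagation; here they stay fixed.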
3. Position information embedding
In each TLS application flow, both the length of each packet in the packet sequence and the order of the packets are closely related to the application type. To exploit the association between a packet's position in the packet sequence and the application type, the invention uses position-information embedding to enhance the positional representation capability of the sample embedding vector x. The specific operation of position-information embedding is as follows:
First, a corresponding position encoding (Positional Encoding) is generated for each element in the sample embedding vector. Let an element of the sample embedding vector x be denoted x_(pos,i), where pos is the sequence position of that element's length-value embedding vector within x, and i is the index of the element within the length-value embedding vector, with pos ∈ {0, 1, 2, …, l − 1} and i ∈ {0, 1, 2, …, d − 1}. The position encoding PE_(pos,i) of x_(pos,i) is then computed as:

PE_(pos,2j) = sin(pos / 10000^(2j/d)),  PE_(pos,2j+1) = cos(pos / 10000^(2j/d))

where sin is the trigonometric sine function, cos is the trigonometric cosine function, d is the dimension of the length-value embedding vector (i.e., the number of elements in it), and 2j and 2j + 1 index the even and odd values of i. Computing the position encoding PE_(pos,i) for every x_(pos,i) yields the position-embedding representation vector x_p, whose dimension is (l × d).
Adding the computed position-embedding representation vector x_p and the sample embedding vector x element-wise yields the initial representation vector x′:

x′ = x_p + x
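The sinusoidal position-encoding formula and the element-wise sum x′ = x_p + x can be sketched in pure Python as follows (an illustrative sketch; function names are hypothetical):

```python
import math

def positional_encoding(l, d):
    """Sinusoidal codes: PE[pos][2j] = sin(pos / 10000^(2j/d)),
    PE[pos][2j+1] = cos(pos / 10000^(2j/d))."""
    pe = [[0.0] * d for _ in range(l)]
    for pos in range(l):
        for i in range(d):
            angle = pos / (10000 ** ((2 * (i // 2)) / d))
            pe[pos][i] = math.sin(angle) if i % 2 == 0 else math.cos(angle)
    return pe

def add_position(x, pe):
    """x' = x_p + x : element-wise sum of sample embedding and position codes."""
    return [[xe + pee for xe, pee in zip(xrow, perow)] for xrow, perow in zip(x, pe)]
```

At pos = 0 the even entries are sin(0) = 0 and the odd entries are cos(0) = 1, matching the formula.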
4. token expansion
Representation expansion takes the position-aware initial representation vector x′ as input, applies three 1D convolution layers with different kernel sizes in parallel, assigns different weights to the outputs of the convolution layers, and fuses them by weighting, thereby performing dimension expansion and information gain on the low-dimensional initial representation vector x′. The specific operation is as follows:
(1) Branch representation expansion: the three 1D convolution layers used for the expansion operation are denoted Conv_1, Conv_3, and Conv_5; each has 1 channel (i.e., one convolution kernel) and a kernel sliding stride of 1. Conv_1 has a kernel size of 1, Conv_3 a kernel size of 3, and Conv_5 a kernel size of 5. After the expansion operation, the dimension with value 1 in each expanded representation vector is squeezed out, giving the set of expanded representation vectors {c_1, c_3, c_5}, where each c_i has dimension (l × d), i ∈ {1, 3, 5}. A ReLU activation follows each 1D convolution layer; the ReLU activation function is computed as:

ReLU(x) = max(0, x)
(2) Branch weight generation: the weight-generation operation takes the position-aware representation x′ as input and, for each expanded representation vector c_i in the set, generates a corresponding weight vector g_i that determines how strongly that expanded representation influences the final representation. The weight-generation operation uses a 1D convolution layer Conv_weight with 3 channels and kernel size 1 to compute the weight of each expansion branch; its output is the set of weight vectors {g_1, g_3, g_5}, where each g_i has dimension (l × d), i ∈ {1, 3, 5}.

It should be noted that, to ensure that the branch weights on each feature sum to 1, the weights of the branches for the same feature must be normalized with a softmax function. Let (i, j) be the coordinate of any feature in the representation vector, and let g_1^(i,j), g_3^(i,j), and g_5^(i,j) be the weights of the three branches on that feature. The softmax weight adjustment is:

α_k^(i,j) = exp(g_k^(i,j)) / Σ_{k′∈{1,3,5}} exp(g_{k′}^(i,j)),  where k ∈ {1, 3, 5} and Σ_{k∈{1,3,5}} α_k^(i,j) = 1
The set of adjusted weight vectors is {α_1, α_3, α_5}, where each α_i has dimension (l × d), i ∈ {1, 3, 5}.

(3) Representation fusion: the expanded representation vectors {c_1, c_3, c_5} are fused element-wise with the weight vectors {α_1, α_3, α_5} to obtain the representation vector I of the encrypted TLS application flow. The weighted fusion operation is:

I = α_1 ⊙ c_1 + α_3 ⊙ c_3 + α_5 ⊙ c_5

where ⊙ denotes element-wise multiplication. The resulting representation vector I has dimension (l × d).
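The per-feature softmax normalization and the weighted fusion of the three branches can be sketched as follows (a minimal sketch over flattened feature lists; the convolution branches themselves are omitted, and all names are hypothetical):

```python
import math

def softmax3(w1, w3, w5):
    """Normalize the three branch weights at one feature position so they sum to 1."""
    m = max(w1, w3, w5)                       # subtract max for numerical stability
    e1, e3, e5 = math.exp(w1 - m), math.exp(w3 - m), math.exp(w5 - m)
    s = e1 + e3 + e5
    return e1 / s, e3 / s, e5 / s

def gated_fusion(c1, c3, c5, g1, g3, g5):
    """I = a1*c1 + a3*c3 + a5*c5, element-wise, over flattened (l*d) feature lists."""
    I = []
    for x1, x3, x5, w1, w3, w5 in zip(c1, c3, c5, g1, g3, g5):
        a1, a3, a5 = softmax3(w1, w3, w5)
        I.append(a1 * x1 + a3 * x3 + a5 * x5)
    return I
```

With equal raw weights each branch contributes 1/3, so the fused feature is the mean of the three branch features.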
5. Self-attention module
The method uses a multi-unit self-attention mechanism to extract, from the representation vector I, the key information that contributes most to TLS application traffic classification. To improve the representational capacity of the self-attention features, the multi-unit mechanism runs h self-attention computation units with different learnable parameters simultaneously, each extracting features from I independently of the others. Within each self-attention unit, the representation vector I must be converted into three different components (denoted Q, K, V) in order to complete the query-based extraction of key information. Each self-attention computation unit independently produces its own self-attention feature component, and all components (denoted O_1, O_2, …, O_h) are concatenated in order and linearly transformed to obtain the final self-attention feature vector O.
Taking an arbitrary self-attention computation unit i (i ∈ {1, 2, …, h}) as an example, the generation of its self-attention feature component O_i proceeds as follows.
(1) Computation-component generation: first, taking the representation vector I as input, three learnable parameter matrices W_i^Q, W_i^K, and W_i^V convert I into the computation components Q_i, K_i, and V_i of the self-attention mechanism:

Q_i = I · W_i^Q,  K_i = I · W_i^K,  V_i = I · W_i^V

where W_i^Q has dimension (d × d_q), W_i^K has dimension (d × d_k), W_i^V has dimension (d × d_v), and d_q = d_k. In the neural network structure, the method uses three different fully connected layers, with d_q, d_k, and d_v neurons respectively, to carry out the three corresponding transformation computations.
(2) Self-attention computation: the attention matrix A_i is generated from the components Q_i and K_i, computed as:

A_i = softmax(Q_i · K_i^T / √d_k)

The self-attention sub-feature O_i generated by the i-th attention computation unit is the result of using the attention matrix A_i to query the values of component V_i; the computation of sub-feature O_i is:

O_i = Attention(Q_i, K_i, V_i) = A_i · V_i
The TLS application traffic classification model completes the extraction of all self-attention sub-features in parallel to obtain {O_1, O_2, …, O_h}, then concatenates all sub-features and applies a linear parameter transformation to obtain the final self-attention feature O:

O = Concat(O_1, O_2, …, O_h) · W^O

where each sub-feature O_i has dimension (l × d_v); the Concat operator joins the sub-features along the second dimension into a single vector of dimension (l × h·d_v); and W^O is a trainable parameter matrix of dimension (h·d_v × d), so that the resulting self-attention feature O has dimension (l × d).
After the self-attention calculation, residual fusion is performed: the input characterization vector I and the self-attention feature vector O are added element by element and layer-normalized, yielding the residual self-attention feature vector O′ of dimension (l × d). The operation is expressed as:

O′ = LayerNorm(I + O)
It should be noted that self-attention modules can be stacked in sequence: the output of one self-attention module serves as the input of the next, and the number of stacked modules can be set according to the actual requirements of feature extraction. The subsequent experiments also discuss the effect of the number of self-attention modules on the classification of encrypted TLS application traffic.
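To make the computation above concrete, the following is a minimal NumPy sketch of one multi-unit self-attention module, including the residual fusion and layer normalization. All shapes and parameter values are illustrative, and the random matrices merely stand in for the trained fully connected layers:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def multi_head_self_attention(I, Wq, Wk, Wv, Wo):
    """I: (l, d); Wq/Wk/Wv: lists of h matrices of shape (d, dq)/(d, dk)/(d, dv); Wo: (h*dv, d)."""
    heads = []
    for WQ, WK, WV in zip(Wq, Wk, Wv):
        Q, K, V = I @ WQ, I @ WK, I @ WV                 # computation components Q_i, K_i, V_i
        A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))      # attention matrix A_i, shape (l, l)
        heads.append(A @ V)                              # sub-feature O_i, shape (l, dv)
    O = np.concatenate(heads, axis=-1) @ Wo              # Concat along feature dim, project to (l, d)
    return layer_norm(I + O)                             # residual fusion -> O'

rng = np.random.default_rng(0)
l, d, h, dq, dv = 16, 32, 4, 8, 8
I = rng.normal(size=(l, d))
Wq = [rng.normal(size=(d, dq)) for _ in range(h)]
Wk = [rng.normal(size=(d, dq)) for _ in range(h)]        # dq == dk
Wv = [rng.normal(size=(d, dv)) for _ in range(h)]
Wo = rng.normal(size=(h * dv, d))
O_prime = multi_head_self_attention(I, Wq, Wk, Wv, Wo)
print(O_prime.shape)  # (16, 32)
```

Because the output dimension (l × d) equals the input dimension, the module can be stacked by feeding O′ directly into the next module, as the note above describes.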
6. Category-sensitive feature extraction and discrimination
The method converts the feature vector extracted in the preceding steps into feature components, one for each classifiable encrypted TLS application traffic type, and averages each feature component to obtain the probability that the sample belongs to that type. The detailed feature extraction and discrimination steps are described in this section.
Let t be the number of protocol types the encrypted TLS application traffic classification model can distinguish. The method uses t mutually independent class feature computing units to calculate the feature expression vector of each classifiable encrypted TLS application traffic type, and thereby judges the possibility that a sample belongs to each type. Taking an arbitrary class feature computing unit i (i ∈ {1, 2, …, t}) as an example, the process of generating the class feature vector F_i is described as follows.
(1) Computation component generation: first, taking the self-attention feature vector O′ as input, three learnable parameter matrices W_i^Q, W_i^K and W_i^V convert the self-attention feature vector O′ into the computation components Q_i, K_i, V_i of the self-attention mechanism, namely:

Q_i = O′ · W_i^Q
K_i = O′ · W_i^K
V_i = O′ · W_i^V

where W_i^Q has dimension (d × d_q), W_i^K has dimension (d × d_k), W_i^V has dimension (d × d_v), and d_q = d_k. In addition, in this step the value of d_v can be designated as required. In the neural network structure, this step uses three fully connected layers, whose numbers of neurons are d_q, d_k and d_v respectively, to implement the corresponding transformations.
(2) Class feature calculation: the attention matrix A_i is generated from components Q_i and K_i, with the calculation formula:

A_i = softmax(Q_i · K_i^T / √d_k)

The class feature F_i generated by the i-th class feature computing unit is the result of applying the attention matrix A_i to component V_i:

F_i = A_i · V_i
The i-th class feature F_i has dimension (l × d_v); the class feature vector F = {F_1, F_2, …, F_t} composed of all class features has dimension (t × l × d_v).
(3) Class discrimination generation: for any sample, the method uses the global feature average to determine the encrypted TLS application traffic type to which it belongs.
First, each class feature F_i is summed element by element and averaged to obtain the global average p_i of the i-th class feature:

p_i = (1 / (l · d_v)) Σ_{j=1}^{l} Σ_{k=1}^{d_v} F_i[j, k]

Accordingly, the class discrimination vector of the sample is P = {p_1, p_2, …, p_t}. The class discrimination vector P can be converted with the softmax function into a probability distribution that sums to 1.
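A corresponding NumPy sketch of the class-sensitive discrimination step: t independent attention units, element-wise averaging of each class feature F_i, and a softmax over the resulting discrimination vector P. The random weights are placeholders for the trained parameters, and all sizes are illustrative:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def class_discrimination(O_prime, params):
    """O_prime: (l, d); params: list of t (WQ, WK, WV) tuples, one per application class."""
    P = []
    for WQ, WK, WV in params:
        Q, K, V = O_prime @ WQ, O_prime @ WK, O_prime @ WV
        A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))      # attention matrix A_i
        F_i = A @ V                                      # class feature F_i, shape (l, dv)
        P.append(F_i.mean())                             # global average p_i over all l*dv elements
    return softmax(np.array(P))                          # probability distribution over t classes

rng = np.random.default_rng(1)
l, d, dq, dv, t = 16, 32, 8, 8, 6
O_prime = rng.normal(size=(l, d))
params = [(rng.normal(size=(d, dq)), rng.normal(size=(d, dq)), rng.normal(size=(d, dv)))
          for _ in range(t)]
P = class_discrimination(O_prime, params)
print(P.shape, P.sum())  # 6 class probabilities summing to 1
```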
7. Model effect assessment
After a round of classification training on the encrypted TLS application traffic is completed, the method judges whether the computation result of the neural network meets the end condition L: (a) if it does, the training stage terminates and the TLS application traffic classification model is output as the final result of the training stage; (b) if it does not, a loss function value is calculated from the network classification result, the neural network parameters are updated by back propagation, and the process returns to step 2 (embedding characterization) to repeat the training.
The setting of the end condition L may include, but is not limited to, the following conditions: the maximum iteration period is reached, the expected loss function value is reached, the expected statistical evaluation index is reached, and the like.
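The end condition L can be checked with a simple predicate combining, for example, a maximum iteration period and an expected loss value. The thresholds below are illustrative, not values from the patent:

```python
def end_condition_met(epoch, loss, max_epochs=100, target_loss=0.05):
    """End condition L: stop when the maximum iteration period is reached
    or the expected loss function value is achieved (illustrative thresholds)."""
    return epoch >= max_epochs or loss <= target_loss

print(end_condition_met(10, 0.30))   # False: keep training
print(end_condition_met(10, 0.04))   # True: target loss reached
print(end_condition_met(100, 0.30))  # True: iteration limit reached
```

A statistical evaluation index (e.g. validation accuracy) could be added to the predicate in the same way.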
Classification stage: the classification stage of the method is built on the encrypted TLS application traffic classification model established in the training stage, and judges the protocol type of unclassified encrypted TLS application traffic. The classification model used in this stage is the neural network constructed in the training stage, so the specific computations performed on an encrypted TLS application traffic sample (embedding characterization, position information embedding, characterization expansion, self-attention feature extraction, and application classification) are exactly the same as in the training stage; they are collectively summarized as the encrypted TLS application traffic classification step and not repeated here. The flow of the classification stage is shown in FIG. 2 and described as follows:
1. Encrypted TLS application traffic processing: taking the collected unclassified encrypted TLS application traffic as input, and following the step "1. encrypted TLS application traffic processing" of the training stage, convert the unclassified encrypted TLS application traffic into message length value sequences of fixed length k, which serve as the encrypted TLS application traffic samples to be classified.
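The fixed-length conversion described above can be sketched as follows (the function name and the example message lengths are illustrative):

```python
def to_fixed_length(pkt_lengths, k):
    """Pad with zeros or truncate a flow's message-length sequence to exactly k values."""
    seq = list(pkt_lengths[:k])          # keep only the first k message lengths
    seq += [0] * (k - len(seq))          # zero-pad short flows at the tail
    return seq

# A flow shorter than k is zero-padded; a longer one keeps only its first k lengths.
print(to_fixed_length([1514, 66, 1024], 8))        # [1514, 66, 1024, 0, 0, 0, 0, 0]
print(to_fixed_length(list(range(100, 150)), 8))   # [100, 101, 102, 103, 104, 105, 106, 107]
```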
2. Encrypted TLS application traffic classification: taking the processed message length value sequences as input, the encrypted TLS application traffic classification model obtained in the training stage completes feature extraction and type judgment for the encrypted TLS application traffic to be classified, and finally outputs the type judgment for each encrypted TLS application flow.
In combination with the method for classifying the encrypted TLS application traffic on the cloud platform, the patent also discloses a system for classifying the encrypted TLS application traffic on the cloud platform. The system consists of three main modules: (1) the flow processing module is responsible for processing input encrypted TLS application flow in a training stage and a classification stage and converting each flow into a message length value sequence in a specified format; (2) the application classification model building module works in a training stage, is based on encrypted TLS application traffic training data, and is responsible for building and training an application classification model of encrypted TLS traffic; (3) and the application classification module works in a classification stage and is responsible for carrying out type judgment on the unclassified encrypted TLS application traffic on the basis of an application classification model of the encrypted TLS traffic generated in a training stage.
The workflow of the system is divided into a training stage and a classification stage; the system architecture is shown in FIG. 3.
1. Training stage: in this stage, encrypted TLS application traffic of known type is taken as input. The traffic processing module (1) counts and converts the message sequence of each TLS protocol network flow into a fixed-length message length value sequence. The application classification model building module (2) first constructs the neural network structure used by the encrypted TLS application traffic classification model and initializes the neural network parameters or loads predefined parameters; then, over multiple rounds of iterative training, it performs feature generation and classification on the message length value sequences and iteratively updates the neural network parameters according to the difference between the classification results and the actual types of the encrypted TLS application traffic samples. Once the model's ability to classify encrypted TLS application traffic of known types meets the requirements of the classification stage, the application classification model building module (2) outputs the encrypted TLS application traffic classification model for use in the classification stage.
2. Classification stage: in this stage, the encrypted TLS application traffic to be classified is taken as input. The traffic processing module (1) counts and converts it into fixed-length message length value sequences. Based on the trained encrypted TLS traffic application classification model, the application classification module (3) performs feature generation and type judgment on the message length value sequences to be classified, and its output is the type judgment of the encrypted protocol traffic to be classified.
The system can restart the training stage and adjust the application classification model of the encrypted TLS flow according to the requirements of TLS application flow type change, classification effect change and the like, so as to ensure that the application classification capability of the system on the encrypted TLS flow continuously meets the application scene requirements.
In the validation experiment, the invention is validated on encrypted network traffic generated by six different Alibaba applications running on the same cloud platform (Taopiaopiao, Amap, Youku, Xiami Music, Ele.me and Taobao). The network traffic generated by each application is shown in Table 1, from which 5,000 TLS streams are randomly selected per application class, so the experimental data total 30,000 samples (6 classes × 5,000). In addition, 5-fold cross validation is performed on the experimental data set, with the training set, validation set and test set divided in a 3 : 1 : 1 ratio.
table 1: verifying the application name and various classes of network traffic information used in the experiment, where M represents 106K represents 103
Application name Number of streams Number of messages Number of bytes
Tab-washing ticket 6,758 766.1K 1057.8M
High map 9,988 141.0K 123.6M
Youke 16,509 385.9K 454.0M
Shrimp music 8,207 369.0K 485.6M
Hungry how 9,613 210.5K 225.1M
Taobao (treasure made of Chinese herbal medicine) 7,470 230.7K 294.4M
After the classification model design of the encrypted TLS application traffic is completed, in order to evaluate the classification performance, an appropriate classification evaluation index must be defined. For the particular application r being analyzed, the following criteria are defined to evaluate the classification performance of the classifier:
(1) True Positive Rate (TPR) of application r:

TPR_r = TP_r / (TP_r + FN_r)

(2) False Positive Rate (FPR) of application r:

FPR_r = FP_r / (FP_r + TN_r)

where TP_r, FN_r, FP_r and TN_r are respectively the numbers of true positive, false negative, false positive and true negative samples for application r.
(3) TPR_r and FPR_r each reflect one aspect of the overall classification performance; the FTF_r index is defined as a trade-off between TPR_r and FPR_r.
(4) Since the invention addresses multi-class classification of mixed application traffic, the overall classification performance across applications is evaluated with the Accuracy (ACC) index, defined as:

ACC = (Σ_{r=1}^{R} TP_r) / N

where R represents the number of application categories to be classified and N is the total number of samples.
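The per-application TPR/FPR and the overall accuracy can be computed from a confusion matrix as below. This is a sketch using the standard definitions; the FTF trade-off index is omitted because its exact formula is specific to the patent, and the 3-class confusion matrix is a toy example:

```python
import numpy as np

def per_class_metrics(cm):
    """cm: (R, R) confusion matrix with cm[true, predicted].
    Returns per-class TPR_r and FPR_r arrays, plus overall accuracy ACC."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp                 # true class r, predicted as something else
    fp = cm.sum(axis=0) - tp                 # predicted r, true class is something else
    tn = total - tp - fn - fp
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    acc = tp.sum() / total                   # multi-class accuracy (ACC)
    return tpr, fpr, acc

# Toy 3-class example, 50 samples per true class.
cm = [[45, 3, 2],
      [4, 40, 6],
      [1, 2, 47]]
tpr, fpr, acc = per_class_metrics(cm)
print(np.round(tpr, 3), np.round(fpr, 3), round(acc, 3))  # [0.9 0.8 0.94] [0.05 0.05 0.08] 0.88
```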
In the validation experiment, the three important parameters are: (1) k, the number of initial data messages of a TLS flow used for classification; (2) d, the embedding vector dimension of the embedding layer; (3) L, the number of self-attention layer repetitions. In the following evaluation experiments, the parameter selection ranges are k ∈ {8, 16, 32}, d ∈ {128, 256, 512, 1024}, and L = 1. Next, the experimental results of encrypted TLS application traffic classification are described.
FIG. 4 plots the variation of the Accuracy value for the six applications on the validation set of the experimental data set for different values of the parameters k and d. The ACC values vary between 89.81% and 91.89% across parameter settings. The best parameter values on the validation data set are k = 32, d = 1024 and L = 1, with a corresponding ACC value of 91.89%. FIG. 4 makes clear that, for all values of d, the Accuracy value decreases for lower values of k. In addition, for lower values of d, the Accuracy value also generally decreases.
FIG. 5 plots the variation of the Accuracy value for the six applications on the test set of the experimental data set for different values of the parameters k and d. Notably, the parameter setting that performs best on the validation set (k = 32, d = 1024 and L = 1) achieves a classification accuracy of 91.82% on the test set.
Table 2: Experimental comparison with existing encrypted TLS application traffic classification methods
Table 2 shows the experimental comparison of the present invention with state-of-the-art encrypted TLS application traffic classification methods, detailing the TPR, FPR, average TPR, average FPR and FTF evaluation indexes for each application. As is evident from Table 2, the FTF value of the present invention on the experimental data set is 90.34, while the FTF values of the existing methods MaMPF and FS-Net are 67.56 and 87.99 respectively. The experimental effect of the invention is therefore superior to that of MaMPF and FS-Net.

Claims (11)

1. A classification method for encrypted TLS application traffic on a cloud platform is characterized by comprising a training stage and a classification stage;
the training phase comprises the following steps:
1) taking encrypted TLS application flow of a known type as input, and converting each flow into a formatted message length value sequence;
2) using the message length value sequence obtained in the step 1) as input to form a training data set, and constructing an encrypted TLS application flow classification model by adopting a supervised learning mode;
the classification phase comprises the following steps:
3) taking network flow data as input, acquiring unclassified encrypted TLS application flow, and converting the encrypted TLS application flow to be detected into a message length value sequence in the same format as that in the step 1);
4) according to the encrypted TLS application traffic classification model obtained in step 2) of the training stage, performing type judgment on the encrypted TLS application traffic to be classified and outputting the judgment result.
2. The method according to claim 1, wherein the specific operation method of step 1) for converting the encrypted TLS application traffic into the message length value sequence is:
1-1) extracting all messages of a complete flow from the protocol flow of the encrypted TLS application, and arranging the messages in sequence;
1-2) counting the byte length of each message in the message sequence, and arranging the message length values according to the message sequence to obtain a message length value sequence;
1-3) inputting format requirements according to an application classification model, if the total length of the message length value sequence is smaller than a specified value, filling zero at the tail part, if the total length of the message length value sequence is larger than the specified value, only keeping the front part of the message length value sequence, and discarding the length value exceeding the specified value to obtain a formatted message length value sequence;
3. the method of claim 1, wherein step 2) building the encrypted TLS application traffic classification model is by:
2-1) embedding characterization operation: taking the length value sequence of the formatted message obtained in the step 1) as input, using an embedded representation generation layer to convert each message length value into an embedded vector of a given dimension, and obtaining a sample embedded vector as output;
2-2) position information embedding operation: taking the sample embedded vector obtained in the step 2-1) as input, generating a corresponding position embedded characterization vector according to each element coordinate in the vector, and integrating the position embedded characterization vector with the sample embedded vector to obtain an initial characterization vector as output;
2-3) characterizing the augmentation operation: taking the initial characterization vector obtained in the step 2-2) as an input, using a plurality of convolution layers with different sizes to generate branch characterization vectors in parallel, and fusing to obtain the characterization vectors as an output;
2-4) self-attention feature extraction operation: taking the characterization vector obtained in the step 2-3) as an input, and performing feature extraction by using at least one layer of self-attention module to obtain a self-attention feature vector as an output;
2-5) category-sensitive feature extraction and discrimination: taking the self-attention feature vector obtained in the step 2-4) as input, and performing class feature extraction corresponding to the classifiable encryption TLS application type to obtain a class discrimination vector of an encryption TLS application flow sample as output;
2-6) taking the type discrimination obtained in the step 2-5) and the real application type of the sample as input, calculating indexes such as model classification accuracy, loss function values and the like, if the indexes meet a termination condition L, stopping the model construction process, and outputting an application classification model; and if the index does not meet the termination condition L, updating the network parameters according to the index, and repeating the step 2-1) to the step 2-6).
4. The method of claim 2, wherein the specific operation method of the embedded characterization operation of step 2-1) is:
taking the formatted message length value sequence obtained in the step 1) as input, generating a layer by using trainable embedding representation, corresponding the numerical value of the message length to a vector, converting all length values in the message length value sequence into corresponding vectors, and obtaining a sample embedding vector;
5. the method as claimed in claim 2, wherein the location information embedding operation of step 2-2) is performed by:
2-2-1) taking the sample embedded vector obtained in the step 2-1) as an input, and generating a position code with position information for any element in the vector according to the dimensional coordinates of the element in the vector. Wherein, the same message length value corresponds to the adjacent elements in the vector, and two different calculation formulas are respectively used. Converting all elements into corresponding position codes to obtain position embedded characterization vectors;
2-2-2) carrying out element-by-element addition on the position embedded characterization vector and the sample embedded vector to obtain an initial characterization vector;
6. the method as claimed in claim 2, wherein the specific operation method for characterizing the augmentation operation in step 2-3) is:
2-3-1) taking the initial token vector obtained in the step 2-2) as an input, and respectively processing the initial token vector by using a plurality of independent 1D convolution layers to obtain a plurality of branch expansion vectors. In the method, the total number of branches is 3, the sizes of convolution kernels of all 1D convolution layers are different, the number of channels (the number of the convolution kernels) is the same, and an activation function is used after convolution operation;
2-3-2) taking the initial characterization vector obtained in the step 2-2) as input, and generating a corresponding weight vector for each branch expansion vector by using a 1D convolutional layer with the convolutional kernel size of 1 and the channel number being the same as the number of branches in the step 2-3-1);
2-3-3) normalizing the weight vector to make the sum of the weights corresponding to each branch on the same characteristic value 1;
2-3-4) multiplying the three branch expansion vectors by the normalization weight vectors corresponding to the branches, and adding the weighted branch vectors element by element to obtain a characterization vector;
7. the method as claimed in claim 2, wherein the specific operation method of the self-attention feature extraction operation of step 2-4) is:
2-4-1) taking the characterization vector obtained in the step 2-3) as an input, and respectively carrying out self-attention calculation operation by using a plurality of self-attention calculation units which are independent from each other to obtain self-attention characteristic components corresponding to the self-attention calculation units;
2-4-2) connect all self-attention feature components generated by the same sample in the dimension of the feature representation of a single element in the sequence. Before and after connection, the dimension of the sequence length is kept unchanged;
2-4-3) performing matrix multiplication on the self-attention feature vector after connection by using a trainable parameter matrix to obtain a self-attention feature vector, wherein the vector dimension of the self-attention feature vector is the same as that of the characterization vector obtained in the step 2-3);
2-4-4) application of the classification model of the present method may use a plurality of self-attention feature extraction modules in sequence. If more than one self-attention feature extraction module is used in the application classification, the subsequent self-attention feature extraction operation takes the self-attention feature vector obtained by the preceding self-attention feature extraction operation as an input, and the steps 2-4-1) to 2-4-3) are repeated;
8. the method as claimed in claim 2, wherein the specific operation method of the category-sensitive feature extraction and discrimination operation in step 2-5) is:
2-5-1) taking the self-attention feature vector obtained in the step 2-4-3) as input, allocating a corresponding independent computing unit for each application category in the application classification model, and obtaining a class feature vector of the sample in the application category through class feature computing operation;
2-5-2) respectively averaging class feature vectors corresponding to each application class element by element, and taking the average value obtained by each class as a discrimination value of the sample in the class;
2-5-3) integrating the discrimination values of all categories into a vector form to obtain a category discrimination vector of the sample;
9. the method as claimed in claim 7, wherein the specific operation method of the self-attention computing operation of step 2-4-1) is:
2-4-1-1) taking the characterization vector obtained in 2-3) as input, converting it into three different self-attention computation components, denoted Q, K, V; with the message byte length value sequence of length l, the dimensions of the computation components Q, K, V are (l × d_q), (l × d_k) and (l × d_v) respectively, with d_q = d_k;
2-4-1-2) using the following self-attention formula to obtain the self-attention feature component O:

O = softmax(Q · K^T / √d_k) · V
10. the method as claimed in claim 8, wherein the specific operation method of the class feature calculation operation in step 2-5-1) is:
2-5-1-1) taking the self-attention feature vector obtained in 2-4) as input, converting it into three different self-attention computation components, denoted Q, K, V; with the message byte length value sequence of length l, the dimensions of the computation components Q, K, V are (l × d_q), (l × d_k) and (l × d_v) respectively, with d_q = d_k;
2-5-1-2) using the following self-attention formula to obtain the class feature vector P:

P = softmax(Q · K^T / √d_k) · V
11. an encrypted TLS application traffic classification system on a cloud platform is characterized by comprising a traffic processing module and an application classification model building module which are used in a training stage, and a traffic processing module and an application classification module which are used in a classification stage; wherein the content of the first and second substances,
the flow processing module is responsible for carrying out statistical processing on the input protocol flow of the encrypted TLS application, and converting each flow into a message length value sequence in a specified format to be used as a training sample or a sample to be classified;
the application classification model building module in the training stage is responsible for building an application classification model of the encrypted TLS flow according to the encrypted TLS flow training sample of the known application type and performing model training to enable the application classification model to meet the training termination condition for the classification stage;
and the application classification module of the classification stage is responsible for carrying out feature extraction on the message length value sequence of the encrypted TLS flow to be classified acquired by the flow processing module according to the application classification model generated in the training stage, judging the application type of the message length value sequence and outputting a judgment result.
CN202110669055.1A 2021-06-17 2021-06-17 Encryption TLS application flow classification method and system on cloud platform Active CN113723440B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110669055.1A CN113723440B (en) 2021-06-17 2021-06-17 Encryption TLS application flow classification method and system on cloud platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110669055.1A CN113723440B (en) 2021-06-17 2021-06-17 Encryption TLS application flow classification method and system on cloud platform

Publications (2)

Publication Number Publication Date
CN113723440A true CN113723440A (en) 2021-11-30
CN113723440B CN113723440B (en) 2024-05-07

Family

ID=78672974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110669055.1A Active CN113723440B (en) 2021-06-17 2021-06-17 Encryption TLS application flow classification method and system on cloud platform

Country Status (1)

Country Link
CN (1) CN113723440B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114726800A (en) * 2022-03-12 2022-07-08 北京工业大学 Rule type application flow classification method and system based on model interpretation
CN115310117A (en) * 2022-10-12 2022-11-08 江苏泰恩特环境技术有限公司 Remote monitoring operation and maintenance system for central air conditioner
CN115987687A (en) * 2023-03-17 2023-04-18 鹏城实验室 Network attack evidence obtaining method, device, equipment and storage medium
CN116112288A (en) * 2023-04-07 2023-05-12 天翼云科技有限公司 Network intrusion detection method, device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104270392A (en) * 2014-10-24 2015-01-07 中国科学院信息工程研究所 Method and system for network protocol recognition based on tri-classifier cooperative training learning
CN109167680A (en) * 2018-08-06 2019-01-08 浙江工商大学 A kind of traffic classification method based on deep learning
CN110020682A (en) * 2019-03-29 2019-07-16 北京工商大学 A kind of attention mechanism relationship comparison net model methodology based on small-sample learning
CN110247930A (en) * 2019-07-01 2019-09-17 北京理工大学 A kind of refined net method for recognizing flux based on deep neural network
CN112839024A (en) * 2020-11-05 2021-05-25 北京工业大学 Network traffic classification method and system based on multi-scale feature attention


Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114726800A (en) * 2022-03-12 2022-07-08 北京工业大学 Rule type application flow classification method and system based on model interpretation
CN114726800B (en) * 2022-03-12 2024-05-03 北京工业大学 Rule type application flow classification method and system based on model interpretation
CN115310117A (en) * 2022-10-12 2022-11-08 江苏泰恩特环境技术有限公司 Remote monitoring operation and maintenance system for central air conditioner
CN115987687A (en) * 2023-03-17 2023-04-18 鹏城实验室 Network attack evidence obtaining method, device, equipment and storage medium
CN115987687B (en) * 2023-03-17 2023-05-26 鹏城实验室 Network attack evidence obtaining method, device, equipment and storage medium
CN116112288A (en) * 2023-04-07 2023-05-12 天翼云科技有限公司 Network intrusion detection method, device, electronic equipment and readable storage medium
CN116112288B (en) * 2023-04-07 2023-08-04 天翼云科技有限公司 Network intrusion detection method, device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN113723440B (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN113723440B (en) Encrypted TLS application traffic classification method and system on a cloud platform
CN110826059B (en) Method and device for defending against black-box attacks on a malware image-format detection model
CN112235264B (en) Network traffic identification method and device based on deep transfer learning
Wang et al. App-net: A hybrid neural network for encrypted mobile traffic classification
CN113806746B (en) Malicious code detection method based on an improved CNN
CN109543674B (en) Image copy detection method based on a generative adversarial network
Zhong et al. A comparative study of image classification algorithms for Foraminifera identification
Song et al. Analysis of operating system identification via fingerprinting and machine learning
CN114462520A (en) Network intrusion detection method based on traffic classification
CN114492768A (en) Siamese capsule network intrusion detection method based on few-shot learning
Van Huong et al. Intrusion detection in IoT systems based on deep learning using convolutional neural network
Jampour et al. Chaos game theory and its application for offline signature identification
Tang et al. Entropy-based feature extraction algorithm for encrypted and non-encrypted compressed traffic classification
Tan et al. Recognizing the content types of network traffic based on a hybrid DNN-HMM model
Rani et al. Design of an intrusion detection model for IoT-enabled smart home
CN114826681A (en) DGA domain name detection method, system, medium, equipment and terminal
CN117633627A (en) Deep-learning method and system for unknown network traffic classification based on evidential uncertainty evaluation
Sharipuddin et al. Intrusion detection with deep learning on internet of things heterogeneous network
CN116827873A (en) Encrypted application traffic classification method and system based on local-global feature attention
CN111737694A (en) Malware homology analysis method based on behavior trees
Lu et al. Cascaded classifier for improving traffic classification accuracy
CN116471048A (en) Real-time and efficient DDoS attack detection method and system for Internet of things
CN110417786B (en) Fine-grained P2P traffic identification method based on deep features
Pan IoT network behavioral fingerprint inference with limited network traces for cyber investigation
CN112950222A (en) Resource processing anomaly detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant