WO2022041394A1 - A method and device for identifying network encrypted traffic - Google Patents

A method and device for identifying network encrypted traffic

Info

Publication number
WO2022041394A1
WO2022041394A1 PCT/CN2020/118725 CN2020118725W WO2022041394A1 WO 2022041394 A1 WO2022041394 A1 WO 2022041394A1 CN 2020118725 W CN2020118725 W CN 2020118725W WO 2022041394 A1 WO2022041394 A1 WO 2022041394A1
Authority
WO
WIPO (PCT)
Prior art keywords
vector
traffic
network
encrypted traffic
sampling
Prior art date
Application number
PCT/CN2020/118725
Other languages
English (en)
French (fr)
Inventor
徐小龙
林焜达
Original Assignee
南京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京邮电大学
Publication of WO2022041394A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Definitions

  • The invention relates to a method for identifying network encrypted traffic, and also to a device for identifying network encrypted traffic; it belongs to the technical fields of deep learning, network traffic analysis and cyberspace security applications.
  • Traffic classification is one of the most important tasks in modern network communication, but due to the popularization of encryption technology and the rapid growth of network throughput, it becomes more and more difficult to achieve high-speed and accurate identification of encrypted traffic.
  • Encrypted traffic classification is of great significance to traffic engineering, network resource management, QoS (Quality of Service), and cyberspace security management.
  • In recent years, there has also been a huge demand for encrypted traffic analysis and management in new network fields such as IoT networks, software-defined networks, and the mobile Internet. For these reasons, network traffic classification has attracted more and more attention from researchers in both academia and industry.
  • Existing encrypted traffic classification solutions can be roughly divided into three types: port-based, payload-based (for example, Deep Packet Inspection, DPI for short), and statistical-feature-based. Due to the prevalence of dynamic ports and port masquerading techniques, the accuracy of traditional port-based traffic classification is very low.
  • Payload inspection methods such as DPI work like regular string matching: every sample in the fingerprint database must be matched against the complete traffic, so efficiency is very low. More importantly, these fingerprints are generally difficult to use for identifying encrypted traffic.
  • Existing work focuses more on statistical-based machine learning methods. This kind of method requires experts to manually design and extract the statistical characteristics of the traffic, so as to classify the traffic more accurately.
  • Deep learning has developed rapidly and has achieved impressive results in computer vision, natural language processing and other fields, including on a large number of classification problems (e.g., image classification, text sentiment analysis).
  • Deep learning methods are gradually being applied in the network field, for example to traffic classification, which can be regarded as a typical classification problem.
  • CNN Convolutional Neural Network
  • RNN Recurrent Neural Network
  • The purpose of the present invention is to overcome the deficiencies of the prior art by providing a network encrypted traffic identification method and device, which address the long running time and poor real-time performance of traffic identification algorithms caused by encryption technology in the current network environment.
  • the present invention provides a method for identifying network encrypted traffic, including the following processes:
  • the encrypted traffic to be identified is preprocessed, and the preprocessing includes: dividing the encrypted traffic into multiple flows; then collecting multiple consecutive data packets from each flow as samples; and finally vectorizing and standardizing each sample to obtain a formatted sample vector set;
  • the sample vector set obtained after preprocessing is input into the pre-trained hybrid neural network model to obtain a prediction vector, whose element values represent the predicted value of the encrypted traffic belonging to each classification;
  • the hybrid neural network model includes: a 1D-CNN network, a stacked bidirectional LSTM network, and a fully connected layer network;
  • the 1D-CNN network performs spatial feature learning on the input sample vector set and outputs a low-dimensional feature map;
  • the stacked bidirectional LSTM network performs time-series feature learning on the input feature map and obtains a feature-map vector containing the time-series features; the fully connected layer determines the prediction vector from that feature-map vector;
  • the predicted probability distribution over the classifications is calculated from the prediction vector, and the classification with the largest probability is taken as the final classification label of the encrypted traffic.
  • the collection of multiple continuous data packets from the flow as samples includes:
  • if the flow is a small flow, the preset number of consecutive data packets at the head of the flow are collected to form a sample; if fewer data packets exist than the preset number, the existing data packets are selected and the remaining positions are filled with zeros;
  • if the flow is a large flow, several sampling points are selected from the flow, and with each sampling point as a starting point a preset number of consecutive data packets are collected to form a sample.
  • the selection scheme of the sampling point includes three strategies: random point sampling, fixed step sampling and burst point sampling; wherein:
  • the random point sampling is random point sampling in the flow; the fixed step sampling starts sampling from the beginning of the flow with a fixed step size; the burst point sampling is to search for the burst point of the data flow in the large flow for sampling.
  • the 1D-CNN network includes:
  • the 1D-CNN network part consists of two 1D-CNN convolutional layers, which perform two convolution operations on the input encrypted traffic sample vector; in each layer, the new feature map output by the convolution operation undergoes batch normalization, nonlinear activation and downsampling.
  • the training of the hybrid neural network model includes:
  • preprocessing each encrypted traffic file, which includes: dividing each encrypted traffic file into multiple streams; then collecting multiple consecutive data packets from each stream as samples; and finally vectorizing and standardizing them to obtain a formatted sample vector set as training samples;
  • training the hybrid neural network model, which includes three parts (a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network), on these samples to obtain the best network parameters;
  • the training of the 1D-CNN network includes:
  • t is any integer from 0 to n and indexes a data packet in the sample; each data packet is an L-dimensional vector;
  • x represents a sample, a vector containing M data packets.
  • x can be regarded as containing M channels, each channel being a one-dimensional vector of length L; let $x_{i:i+j}$ denote the bytes of all channels from position i to position i+j; the one-dimensional convolution on x operates as follows:
  • a convolutional layer contains multiple convolution kernels (Filters), and every Filter performs the same operation to generate one channel of the new feature map; taking any convolution kernel t as an example, w is the window that slides over x, b is the bias value, and f is the nonlinear activation function; as the current Filter slides over x, its convolution operation is applied to the bytes in the window, so that, taken as a whole, the sequence $\{x_{1:h}, x_{2:h+1}, \ldots, x_{n-h+1:n}\}$ generates a new feature map; all Filters operate identically, but the parameters w and b of each Filter are different;
  • the new feature map generated by any convolution kernel t can also be regarded as output channel t; for the new feature map of each channel, a pooling layer (MaxPooling) is usually used to downsample the feature map; the pooling layer operates much like the convolution, also sliding a Filter over its input, but the operation usually performed by each Filter is to keep the maximum value in each sliding window.
  • the training of the stacked bidirectional LSTM network includes:
  • $h^{<t-1>}$ is the output of the hidden layer at the previous time step; its dimension, assumed to be s, is set by the hidden-layer dimension parameter of the LSTM unit; the intermediate output of the current layer is computed from it;
  • $w_c$ and $b_c$ are the parameter matrix and bias, respectively;
  • the final output is determined by three gates, namely the update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$; the gate values are calculated as follows:
  • $\Gamma_o = \sigma(w_o[h^{<t-1>}, a^{<t>}] + b_o)$ (11)
  • $\sigma$ is the nonlinear activation function;
  • $w_u$, $w_f$, $w_o$ and $b_u$, $b_f$, $b_o$ are the parameter matrices and bias values of the three gates, respectively;
  • the three gate values are calculated in a similar way: each is determined by the input $a^{<t>}$ of the current time step and the output $h^{<t-1>}$ of the previous hidden layer; the update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$ act like switches that control whether the current LSTM unit updates the current information, forgets the past information, and outputs the final information; the three switches (gates) produce the final output by the following formula, where $c^{<t>}$ is the intermediate output vector of the current layer:
  • Stacked LSTM refers to stacking multiple layers of LSTM units, while a bidirectional LSTM (Bi-LSTM) runs the LSTM over the time steps in both the forward and reverse directions.
  • Computing the Bi-LSTM only requires concatenating the outputs of the two directions at the current time step.
  • The hidden-layer output $h^{<t>}$ is obtained by concatenating the forward output and the reverse output at that time step.
  • the predicted probability distribution of each classification is obtained by calculating based on the predicted vector, including:
  • o is the original output vector of the hybrid neural network model, the result of applying softmax to o is the predicted probability vector, and $o_i$ is the value at the i-th position of the vector o.
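  • As an illustration of this last step, the following minimal sketch applies the standard softmax, $\exp(o_i) / \sum_j \exp(o_j)$, to a raw output vector and picks the classification with the largest probability. The function names, class names and numbers are hypothetical and are used only for illustration.

```python
import math

def softmax(o):
    """Turn a raw prediction vector o into a probability distribution."""
    m = max(o)  # subtracting the maximum leaves the result unchanged but avoids overflow
    exps = [math.exp(v - m) for v in o]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(o, labels):
    """Return the label with the largest softmax probability, plus all probabilities."""
    probs = softmax(o)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs

# Hypothetical raw outputs and class names, for illustration only.
label, probs = predict_label([1.2, 0.3, 3.1], ["Chat", "Email", "Video"])
```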
  • the present invention also provides a network encrypted traffic identification device, including an encrypted traffic acquisition module, a preprocessing module, a classification prediction module and a classification identification module; wherein:
  • the encrypted traffic acquisition module is used to acquire the encrypted traffic file to be identified;
  • the preprocessing module is used for preprocessing the encrypted traffic to be identified, and includes a stream segmentation unit, a collection unit and a vectorization unit, wherein:
  • the stream segmentation unit is used to split the encrypted traffic into multiple streams;
  • the collection unit is used to collect a plurality of consecutive data packets from each flow as samples;
  • the vectorization unit is used to vectorize and standardize each sample to obtain a formatted sample vector set;
  • the classification prediction module is used to input the sample vector set obtained after preprocessing into the pre-trained hybrid neural network model to obtain a prediction vector, and the element value in the prediction vector represents the prediction value of the encrypted traffic belonging to each classification;
  • the hybrid neural network model includes: a 1D-CNN network, a stacked bidirectional LSTM network, and a fully connected layer network;
  • the 1D-CNN network performs spatial feature learning on the input sample vector set, and outputs a low-dimensional feature map;
  • the stacked bidirectional LSTM network performs time-series feature learning on the input feature map and obtains a feature-map vector containing the time-series features; the fully connected layer determines the prediction vector from that feature-map vector;
  • the classification identification module is used to calculate the predicted probability distribution over the classifications from the prediction vector, and to take the classification with the largest probability as the final classification label of the encrypted traffic.
  • the collection unit includes:
  • the small-flow sampling unit is used to collect the preset number of consecutive data packets at the head of the flow to form a sample; if fewer data packets exist than the preset number, the existing data packets are selected and the remaining positions are filled with zeros;
  • the large-flow sampling unit is used to select several sampling points from the flow and, taking each sampling point as a starting point, to collect a preset number of consecutive data packets to form a sample.
  • the selection scheme of the sampling points includes three strategies: random point sampling, fixed step sampling and burst point sampling; wherein:
  • the random point sampling is random point sampling in the flow; the fixed step sampling starts sampling from the beginning of the flow with a fixed step size; the burst point sampling is to search for the burst point of the data flow in the large flow for sampling.
  • the samples are vectorized and standardized, including:
  • the 1D-CNN network includes:
  • the 1D-CNN network part consists of two 1D-CNN convolutional layers, which perform two convolution operations on the input encrypted traffic vector; in each layer, the new feature map output by the convolution operation undergoes batch normalization, nonlinear activation and downsampling.
  • the present invention realizes automatic extraction of traffic characteristics based on deep learning technology (CNN and RNN). Compared with the rule-based method, this method can adapt to the changes of traffic characteristics brought by different encryption technologies and obfuscation technologies.
  • the present invention proposes a hybrid neural network model, which combines CNN and RNN, uses only a small number of data packets to extract abstract features of traffic, learns spatiotemporal features of data streams, and realizes early identification of traffic.
  • the method does not require manual feature design by experts, and outperforms traditional machine learning-based recognition methods in tests on multiple real network datasets.
  • the present invention performs automatic stream segmentation, vectorization, standardization and other processing on the original encrypted traffic, and retains the timing characteristics of the stream.
  • This method effectively utilizes the spatial distribution and time series features of traffic data, realizes automatic learning of features, and realizes an end-to-end encrypted traffic identification method.
  • the method proposes an in-flow sampling scheme to solve the classification problem and data imbalance of long-term traffic.
  • Figure 1 is the overall framework of the encrypted traffic identification method;
  • Figure 2 is a schematic diagram of the traffic vectorization method;
  • Figure 3 is the overall architecture diagram of the hybrid neural network model;
  • Figure 4 is the flow chart of the encrypted traffic identification method;
  • Figure 5 is a schematic diagram of the detailed architecture and parameter settings of the classification model.
  • the present invention provides a network encryption traffic identification method, which is characterized in that it includes the following processes:
  • the encrypted traffic to be identified is preprocessed, and the preprocessing includes: dividing the encrypted traffic into multiple flows; then collecting multiple consecutive data packets from each flow as samples; and finally vectorizing and standardizing each sample to obtain a formatted sample vector set;
  • the sample vector set obtained after preprocessing is input into the pre-trained hybrid neural network model to obtain a prediction vector, whose element values represent the predicted value of the encrypted traffic belonging to each classification;
  • the hybrid neural network model includes: a 1D-CNN network, a stacked bidirectional LSTM network, and a fully connected layer network;
  • the 1D-CNN network performs spatial feature learning on the input sample vector set and outputs a low-dimensional feature map;
  • the stacked bidirectional LSTM network performs time-series feature learning on the input feature map and obtains a feature-map vector containing the time-series features; the fully connected layer determines the prediction vector from that feature-map vector;
  • the predicted probability distribution over the classifications is calculated from the prediction vector, and the classification with the largest probability is taken as the final classification label of the encrypted traffic.
  • the invention utilizes the hybrid neural network technology to realize the automatic learning of the spatiotemporal characteristics of the encrypted traffic, thereby realizing the high-speed and accurate identification of the encrypted traffic.
  • The extraction of features used to identify encrypted traffic depends on the traffic preprocessing method, the vectorization method, and the information in different parts of the traffic data stream.
  • For example, the meta-information and the payload information of the traffic can each provide different, effective features for identifying encrypted traffic.
  • a hybrid neural network model is designed in this method for automatic representation learning of the above information.
  • FIG. 1 is an overall frame diagram of the method of the present invention, which mainly includes two stages: a preprocessing stage and a classification stage.
  • the preprocessing stage directly converts the original traffic into standard data, which includes four steps: stream segmentation, stream sampling, vectorization, and normalization.
  • In the classification stage, encrypted traffic is classified by a hybrid neural network model designed to capture the spatiotemporal features of the flow; it includes a part that learns spatial distribution features (abstract features) and a part that learns time-series features.
  • For each data packet in the network, the corresponding quintuple (source IP, source port, destination IP, destination port, transport layer protocol) can be found from the header information (meta-information) of the data packet.
  • (1) Random Sampling: the default strategy, which samples at random points in the flow. Each sampling point $s_i$ is a random point from 0 to n.
  • (2) Step Sampling: sampling starts from the beginning of the flow and advances with a fixed step size. The step size is a constant, so adjacent sampling points are always separated by exactly that step.
  • (3) Burst Sampling: finds the burst points in a large flow and samples there. Different user behaviors change the length of traffic packets; for example, the data transmission triggered by a user click usually causes the flow to fluctuate. Communication usually also requires some frames that carry no data, and the length of a TCP or UDP frame that carries no data does not exceed 60 bytes. Burst Sampling therefore detects such data points and selects them as sampling points.
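  • The three strategies can be sketched as follows. This is only an illustration under stated assumptions: the function names and parameters are hypothetical, and the burst rule used here (a packet longer than 60 bytes that directly follows a packet of at most 60 bytes) is one possible reading of the description, not a rule confirmed by it.

```python
import random

def random_points(n, k, seed=None):
    """Random Sampling: k sampling points drawn uniformly from 0..n."""
    rng = random.Random(seed)
    return sorted(rng.randint(0, n) for _ in range(k))

def step_points(n, step):
    """Step Sampling: start at the head of the flow and advance by a fixed step."""
    return list(range(0, n + 1, step))

def burst_points(lengths, threshold=60):
    """Burst Sampling (assumed rule): index of each packet longer than
    `threshold` bytes that directly follows a packet of at most `threshold` bytes."""
    return [i for i in range(1, len(lengths))
            if lengths[i - 1] <= threshold < lengths[i]]

# Hypothetical packet lengths (bytes) of one large flow.
lengths = [54, 1500, 1500, 60, 54, 900]
steps = step_points(10, 4)
bursts = burst_points(lengths)
randoms = random_points(10, 5, seed=1)
```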
  • a network encryption traffic identification method of the present invention includes processes such as preprocessing, sampling, vectorization, and spatiotemporal feature learning of original traffic files.
  • The flow chart of the encrypted traffic identification scheme of the present invention is shown in Figure 4.
  • the specific operation steps are as follows:
  • Step 1 Perform flow segmentation on the original encrypted traffic according to the quintuple information of the traffic data packets, and obtain a flow set of data packets containing the same quintuple information.
  • the traffic collected at a node is not an ordered sequence from a single application, but a mixed sequence containing many applications.
  • collecting traffic at a certain gateway during a certain period of time may include data packets generated by all hosts passing through the gateway in the network, and the data packets are mixed into the current throughput traffic.
  • Application layer encryption mainly refers to encrypting application layer protocols of packets, such as BitTorrent and HTTPS.
  • Network layer encryption is to encrypt the entire packet above the network layer.
  • Encrypted traffic will often still contain unencrypted parts, such as the traffic's meta information. Therefore, we can perform stream segmentation on encrypted traffic according to the meta-information of the stream, but we cannot further obtain the application layer information and payload information of the encrypted part.
  • a flow refers to all packets that contain the same five-tuple (source IP, source port, destination IP, destination port, and transport layer protocol).
  • The original traffic file F (a PCAP file, which stores network encrypted traffic) is taken as input, where $P_i$ is the i-th packet in F with the same quintuple.
  • A stream set is obtained, consisting of streams whose data packets share the same quintuple information.
  • The segmented streams are labeled according to the type of the original traffic file: if a PCAP file in the network traffic dataset carries type label information (for example, the file is marked with a traffic service type such as Chat, Email or Video, depending on the classification task), all streams obtained from that file are marked with this type. The labeled streams are used to train the hybrid neural network model.
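  • Step 1 can be sketched as follows. This is a minimal illustration, assuming the packets have already been parsed out of the PCAP file; the `Packet` record and the function names are hypothetical, and the addresses are made up.

```python
from collections import defaultdict, namedtuple

# Hypothetical packet record; a real pipeline would fill it from a PCAP parser.
Packet = namedtuple("Packet", "src_ip src_port dst_ip dst_port proto data")

def split_flows(packets):
    """Group packets by quintuple, keeping arrival order inside each flow."""
    flows = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.src_port, p.dst_ip, p.dst_port, p.proto)
        flows[key].append(p)
    return dict(flows)

def label_flows(flows, file_label):
    """Give every flow obtained from one traffic file that file's type label."""
    return [(flow, file_label) for flow in flows.values()]

packets = [
    Packet("10.0.0.1", 5000, "10.0.0.2", 443, "TCP", b"a"),
    Packet("10.0.0.3", 6000, "10.0.0.2", 53, "UDP", b"b"),
    Packet("10.0.0.1", 5000, "10.0.0.2", 443, "TCP", b"c"),
]
flows = split_flows(packets)
labeled = label_flows(flows, "Chat")
```

  • Because the key is the exact quintuple, the two directions of one connection end up in separate flows, which matches the definition of a flow given above.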
  • Step 2 For the flow set obtained in Step 1, use different sampling schemes according to the flow duration (which can be divided into large flow or small flow), and sample continuous data packets from each flow to form the original training sample, and obtain the original sample set.
  • the real network is an unbalanced environment.
  • the lengths of the flows in the network vary greatly, and the upstream and downstream traffic are usually asymmetrical.
  • the types of streams in data are divided into large streams (long-term data streams) and small streams (short-term data streams). Different streams have different durations and contain different packets. Small streams may contain tens to hundreds of packets, while large streams may contain tens of thousands to millions of packets. Large flow traffic usually takes up a lot of storage space, so it is difficult to collect enough flow samples for training if the flow is taken as the unit.
  • Each small flow uses a small number of data packets at the head of the flow as a single sample, while each large flow uses in-flow sampling to collect multiple samples from the flow, which alleviates the imbalance of network traffic data (in existing datasets, each large flow is large in size, but the number of large-flow samples is extremely small).
  • For small flows, the packets at the head of the flow are collected, since they contain most of the communication connection establishment information.
  • For large flows, the in-flow sampling technique selects appropriate sampling points from the flow and takes a small number of consecutive data packets starting at each sampling point to form individual samples, thereby solving the problem of data imbalance in large flows.
  • The formatted data helps the computer to train the model.
  • If the input flow is a small flow, execute the small-flow sampling scheme in sub-step 2; if the input flow is a large flow, execute the in-flow sampling scheme in sub-step 3.
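  • The two sampling schemes of Step 2 can be sketched as follows. This is only an illustration: the function names are hypothetical, packets are represented as byte strings, and a missing packet is represented by an empty byte string, which becomes an all-zero row once every packet is padded to a fixed length during vectorization.

```python
def head_sample(flow, m):
    """Small flow: take the first m packets; pad with empty packets if fewer exist."""
    sample = list(flow[:m])
    sample += [b""] * (m - len(sample))
    return sample

def in_flow_samples(flow, points, m):
    """Large flow: from each sampling point, take m consecutive packets as one sample."""
    return [head_sample(flow[s:], m) for s in points]

# Hypothetical flows, for illustration only.
small = head_sample([b"a", b"b"], 3)
large = in_flow_samples([b"0", b"1", b"2", b"3", b"4"], [0, 2, 4], 2)
```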
  • Step 3 For the original sample set from Step 2, each sample contains M data packets and each data packet retains L bytes; each sample is converted into a vector of dimension (M, L), so that all samples have a uniform shape.
  • (1) Each original sample obtained in Step 2 contains M data packets, and each data packet retains a fixed preset length L: a shorter packet is padded with zeros, and a longer one is truncated.
  • The default value of L is 1500, because the MTU (Maximum Transmission Unit, the largest payload an Ethernet frame can carry) in Ethernet is 1500 bytes.
  • Each packet retains the length L by default.
  • Figure 2 shows each sample formatted in two dimensions.
  • (2) For the sample formatted in sub-step (1), read the binary data stream byte by byte, interpreting the 8 bits of each byte as a decimal integer from 0 to 255.
  • Each raw sample is thereby transformed into a vector of dimension (M, L).
  • (3) To speed up computation and mitigate the gradient explosion problem in deep learning, the vector obtained in sub-step (2) is standardized. Since each byte was read as an integer from 0 to 255 during vectorization, these numbers can be divided directly by 255 for normalization, which gives the formatted sample set.
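  • The vectorization and standardization of Step 3 can be sketched as follows. This is a minimal illustration in which the function name and the tiny values of M and L are hypothetical; packets are represented as byte strings.

```python
def vectorize(sample, m, l=1500):
    """Convert one sample (a list of packets as bytes) into an (m, l) matrix in [0, 1]."""
    packets = list(sample[:m]) + [b""] * (m - len(sample))  # missing packets become all zeros
    rows = []
    for pkt in packets:
        raw = pkt[:l]                    # truncate packets longer than l bytes
        raw += bytes(l - len(raw))       # zero-pad packets shorter than l bytes
        rows.append([b / 255 for b in raw])  # each byte is an integer 0..255
    return rows

# Tiny hypothetical sample: M = 3 packets, L = 4 bytes each.
v = vectorize([b"\xff\x00\x80", b"\x01\x02\x03\x04\x05"], m=3, l=4)
```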
  • Step 4 Repeat steps 1-3 to obtain a large number of formatted training samples, and input the training samples into the hybrid neural network model for training.
  • the hybrid neural network model can extract the spatio-temporal characteristics of flows and improve the accuracy of model prediction.
  • Traditional methods require experts to manually design rules or statistical features (such as flow duration, flow size, packet size, and packet interval) for traffic classification.
  • the hybrid neural network model does not require manual feature design and realizes the automatic extraction of traffic features.
  • the hybrid neural network model includes a spatial feature (abstract feature) learning part and a time-series feature learning part.
  • Convolutional Neural Networks are widely used in the field of images.
  • Existing research shows that, after downsampling through multi-layer CNNs, a model can learn more abstract features of the spatial distribution of images (such as local features of animal images: eyes, mouth, limbs, etc.).
  • the original vector obtained in step 3 has a higher dimension, which will bring more noise while introducing effective information, making it more difficult for the model to perform feature learning.
  • the spatial feature (abstract feature) learning part of the present invention uses a one-dimensional convolutional neural network (1D-CNN) to perform multiple downsampling, thereby reducing the feature dimension and learning the abstract features of the spatial distribution of traffic.
  • the temporal feature learning part uses stacked bidirectional LSTM (Long Short-term Memory) to capture the temporal correlation between traffic packets.
  • The present invention preserves the temporal dimension of the data packets in each sample during vectorization. Consider the t-th data packet in a sample, where t is any integer from 0 to n; each data packet is an L-dimensional vector.
  • x represents a sample, a vector containing M data packets.
  • x can be regarded as containing M channels, each channel being a one-dimensional vector of length L.
  • $x_{i:i+j}$ denotes the bytes of all channels from position i to position i+j.
  • The one-dimensional convolution operation on x is as follows:
  • A convolutional layer contains multiple convolution kernels (Filters), and every Filter performs the same operation to generate one channel of the new feature map.
  • Taking any convolution kernel t as an example, w is the window that slides over x, b is the bias value, and f is the nonlinear activation function.
  • As the current Filter slides over x, its convolution operation is applied to the bytes in the window, producing the features generated by convolution kernel t. Taken as a whole, the sequence $\{x_{1:h}, x_{2:h+1}, \ldots, x_{n-h+1:n}\}$ generates a new feature map. All Filters operate identically, but the parameters w and b of each Filter are different.
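  • The sliding-window operation of a single Filter can be sketched as follows. This is a minimal illustration, assuming the usual form in which each feature is f applied to the weighted sum of the window plus the bias, with no padding and a stride of 1; ReLU is used for f only as an example, and the function name and numbers are hypothetical.

```python
def conv1d_single(x, w, b, f=lambda v: max(0.0, v)):
    """Apply one Filter to x.

    x: M channels, each a list of length n.
    w: M rows of h weights (one row per input channel).
    b: bias value; f: nonlinear activation (ReLU here, as an assumption).
    Returns one output channel of length n - h + 1.
    """
    n, h = len(x[0]), len(w[0])
    out = []
    for i in range(n - h + 1):          # window x[i : i + h]
        acc = b
        for ch_x, ch_w in zip(x, w):    # sum over all input channels
            acc += sum(wk * xk for wk, xk in zip(ch_w, ch_x[i:i + h]))
        out.append(f(acc))
    return out

# Hypothetical input with M = 2 channels of length 4 and a window of h = 2.
feature = conv1d_single([[1, 2, 3, 4], [0, 1, 0, 1]], [[1, -1], [2, 0]], 0.5)
```

  • A full convolutional layer would repeat this with a different w and b for every Filter, giving one output channel per Filter.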
  • A pooling layer (MaxPooling) is also usually used to downsample the feature map.
  • The pooling layer operates much like the convolution: it also slides a Filter over its input, but the operation usually performed by each Filter is to keep the maximum value in each sliding window.
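  • Max pooling over one channel can be sketched as follows. The function name, window size and stride are hypothetical; by default the stride equals the window size, so the windows do not overlap.

```python
def max_pool1d(feature, size, stride=None):
    """Keep the maximum value of each sliding window over one channel."""
    stride = stride or size  # non-overlapping windows by default
    return [max(feature[i:i + size])
            for i in range(0, len(feature) - size + 1, stride)]

pooled = max_pool1d([1, 3, 2, 5, 4, 0], 2)
overlapping = max_pool1d([1, 3, 2, 5, 4], 2, stride=1)
```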
  • 1D-CNNs and fully connected neural networks are similar, but a 1D-CNN is characterized by convolution-kernel weight sharing and sparse connections, which is of great help when operating on high-dimensional vectors.
  • In the 1D-CNN, the traffic is downsampled multiple times. As the number of layers increases, the convolution operation produces more abstract feature maps, so the hybrid neural network model learns more advanced abstract features of the spatial distribution from the original traffic, which helps the subsequent learning of temporal features.
  • Network traffic is also highly time-correlated data, so it is well suited to LSTM.
  • the feature dimension is very large after the vectorization of the original network traffic file, so we consider designing the network architecture based on LSTM on the learned abstract features.
  • In the LSTM, the multiple channels of the input feature map are treated as multiple time steps. At each time step the following holds:
  • $h^{<t-1>}$ is the output of the hidden layer at the previous time step; its dimension, say s, is set by the hidden-layer dimension parameter of the LSTM unit. The intermediate output of the current layer is computed from it and from the current input $a^{<t>}$.
  • $w_c$ and $b_c$ are the parameter matrix and bias, respectively.
  • $\Gamma_o = \sigma(w_o[h^{<t-1>}, a^{<t>}] + b_o)$ (11)
  • $\sigma$ is the nonlinear activation function.
  • $w_u$, $w_f$, $w_o$ and $b_u$, $b_f$, $b_o$ are the parameter matrices and bias values of the three gates, respectively.
  • The three gate values are calculated in a way similar to the intermediate output: they are determined by the input $a^{<t>}$ of the current time step and the output $h^{<t-1>}$ of the previous hidden layer.
  • The update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$ act like switches: they control whether the current LSTM unit updates the current information, forgets the past information, and outputs the final information.
  • The three switches (gates) produce the final output by the following formula, where $c^{<t>}$ is the intermediate output vector of the current layer:
  • The product in this formula is the Hadamard product, i.e. element-wise multiplication of vectors.
  • Stacked LSTM refers to the stacking of multiple layers of LSTM units, while bidirectional LSTM (Bi-LSTM) is forward and reverse in time steps.
  • Bi-LSTM bidirectional LSTM
  • the calculation of Bi-LSTM only needs to connect the outputs of different directions at the current time step, for example:
  • the output h ⁇ t> of the hidden layer is determined by the forward output at the time step. and the reversed output connected.
  • Figure 3 shows the overall architecture of the hybrid neural network model. The input is the high-dimensional vector of encrypted traffic. A 1D-CNN-based network first learns abstract spatial features: two one-dimensional convolutional layers (Conv-1, Conv-2) perform spatial feature learning and downsampling on the input set of sample vectors, producing a new low-dimensional feature map.
  • The second part captures temporal features with a stacked bidirectional LSTM network of two bidirectional LSTM layers. At each time step the vector of one channel of the 1D-CNN feature map is fed in, and the stacked bidirectional LSTM learns the temporal features of the feature map, yielding a feature map vector that contains temporal features.
  • Finally, a fully connected layer converts the previous layer's feature map into a c-dimensional vector, where c is the number of traffic types (for example traffic service types such as Chat, Email or Video, depending on the classification task).
  • In the abstract feature learning part, the network is first designed around a 1D-CNN that automatically extracts abstract features from the traffic.
  • The 1D-CNN shares convolution-kernel weights and uses sparse connections, which reduces the parameter count and helps capture similar spatial features located at different positions in the traffic data stream.
  • The 1D-CNN downsamples the traffic several times. As depth increases, the convolutions generate more abstract feature maps and the model learns higher-level abstract features from the raw traffic, which helps the subsequent learning of temporal features.
  • A stacked LSTM is built by stacking several layers of LSTM units, while a bidirectional LSTM (Bi-LSTM) runs the LSTM in the forward and reverse directions at the same time, because the context of the current time step includes information from both before and after the current position.
  • Through abstract feature learning and temporal feature learning, the hybrid neural network model extracts traffic features automatically, without manual feature design by experts.
  • For each input sample, the model first uses the 1D-CNN for spatial feature learning and downsampling to obtain a low-dimensional feature map, then uses the LSTM to obtain a feature map containing temporal features, and finally outputs a c-dimensional prediction vector o through the fully connected layer, where c is the number of traffic types (for example traffic service types such as Chat, Email or Video, depending on the classification task). Each element of o is the predicted value of the encrypted traffic to be identified belonging to the corresponding class. The network's output vector o contains both positive and negative numbers.
  • o is the raw output vector of the hybrid neural network model, $\hat{y}$ is the predicted probability vector after softmax, and $o_i$ is the value at position i of o.
  • Formula (15) works as follows: e is the base of the natural logarithm, and the exponential $e^{o_i}$ turns $o_i$ into a positive real number. The result at each position is then divided by $\sum_{j} e^{o_j}$ to give the model's predicted probability distribution vector $\hat{y}$.
  • the cross-entropy is used as the loss function, and the model is trained using the gradient descent algorithm.
  • Figure 5 shows the detailed parameter settings of the hybrid neural network model, which contains 13 layers (see “Tier Name”); the 13 layers fall into 4 larger groups (see “Tier”).
  • The figure lists the sizes of each layer's input and output vectors (see “Input”, “Output”) and the number of parameters each layer uses (see “Parameters”); the convolution kernel size and stride are the configurable parameters of the 1D-CNN (see “Convolution kernel”, “Step size”). The total number of trainable parameters is 2,897,104, which indicates the scale of the neural network and the overall size of the model.
  • the classification model consists of three parts:
  • the first part is the convolutional correlation layer.
  • This part contains two large convolutional blocks (Conv-1 and Conv-2). Each block contains one 1D-CNN layer with kernel size 3 and stride 1, followed by Batch Normalization of the layer's output, which makes gradient descent easier, then an activation layer (ReLU), and finally MaxPooling for downsampling with a pooling window of size 2 and stride 2.
  • This part takes the high-dimensional vector of encrypted traffic as input, downsamples it while learning spatial features, and outputs a new low-dimensional feature map vector.
  • The second part is the LSTM-related structure.
  • The hidden size of each LSTM unit is set to 256. Because the LSTM is bidirectional and concatenates the outputs of the forward and reverse directions, the output at each time step is 512-dimensional. Note that since a stacked bidirectional LSTM is used, every Bi-LSTM layer except the last must keep the output of every time step. To reduce overfitting, a dropout layer is added after the Bi-LSTM (the activations of the final output neurons are switched off with a certain probability, called the dropout rate), and the dropout rate is set to 0.5.
  • the third part is the fully connected layer part.
  • The overall parameter count is much smaller than that of a CNN- or LSTM-based network.
  • The hybrid neural network model combines the speed of a CNN with the time-step sensitivity of an RNN (recurrent neural network; this method uses LSTM, a kind of RNN), which keeps the overall model lightweight while retaining the advantages of both.
  • The batch size is set to 128, and the Adam optimizer is used for training. Learning-rate scheduling can be applied to help the model converge better.
  • The hybrid neural network model, comprising a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network, is trained on the training samples to obtain the best network parameters;
  • Step 5: Acquire the encrypted traffic file to be identified, process it with steps 1 to 3, and feed the resulting sample vectors into the trained hybrid neural network model, which outputs the raw prediction vector o of the encrypted traffic, a real-valued vector. o is passed through softmax to obtain the predicted probability distribution over the classes, $\hat{y}$, a c-dimensional vector (c being the number of traffic types) whose value at position i is the probability that the sample belongs to class i; it is computed by formula (15). The final classification label of the input traffic is obtained from the predicted distribution vector $\hat{y}$.
  • The label represents a type of traffic (for example a traffic service type such as Chat, Email or Video, depending on the classification task; all classes are numbered from 0).
  • The method of the present invention comprises a preprocessing stage and a classification stage.
  • In the preprocessing stage, the raw traffic is split into flows, sampled, vectorized and standardized, and an in-flow sampling scheme for large flows is proposed to solve the classification problem for large flows (long-duration data streams).
  • In the classification stage, a CNN is used for spatial feature capture and abstract feature extraction, and then, on top of the abstract features, a stacked bidirectional LSTM learns the temporal features of the traffic, achieving automatic feature extraction and efficient identification of encrypted traffic.
  • the method is versatile and can automatically extract spatiotemporal features of encrypted traffic without the need for manual feature design by experts. Moreover, it can adapt to changes in traffic characteristics caused by different encryption technologies and obfuscation technologies.
  • the present invention also provides a network encrypted traffic identification device, including an encrypted traffic acquisition module, a preprocessing module, a classification prediction module and a classification identification module; wherein:
  • the encrypted traffic acquisition module is used to acquire the encrypted traffic file to be identified
  • a preprocessing module used for preprocessing the encrypted traffic to be identified, the preprocessing module includes a stream segmentation unit, a collection unit and a vectorization unit, wherein:
  • the stream splitting unit is used to split the encrypted traffic stream into multiple streams
  • a collection unit for collecting a plurality of consecutive data packets as samples from each flow
  • the vectorization unit is used to vectorize and standardize each sample to obtain a formatted sample vector set
  • the classification prediction module is used to input the sample vector set obtained after preprocessing into the pre-trained hybrid neural network model to obtain a prediction vector, and the element value in the prediction vector represents the prediction value of the encrypted traffic belonging to each classification;
  • the hybrid neural network model includes: a 1D-CNN network, a stacked bidirectional LSTM network, and a fully connected layer network;
  • the 1D-CNN network performs spatial feature learning on the input sample vector set, and outputs a low-dimensional feature map;
  • the stacked bidirectional LSTM network performs temporal feature learning on the input feature map to obtain a feature map vector containing temporal features, and the fully connected layer determines the prediction vector from that feature map vector;
  • the classification identification unit is used to calculate the predicted probability distribution of each classification based on the prediction vector, and take the classification corresponding to the largest probability as the final classification label of the encrypted traffic.
  • the collection unit includes:
  • the small stream sampling unit is used to collect the preset number of continuous data packets in the stream header to form a sample. If the existing data packets are less than the preset number, the existing data packets are selected, and the remaining packets are filled with zeros for processing;
  • the large flow sampling unit is used to select several sampling points from the flow and, taking each sampling point as a starting point, collect a preset number of consecutive data packets to form a sample.
  • the selection scheme of sampling points includes: random point sampling, fixed step sampling and burst point sampling three strategies; wherein:
  • the random point sampling is random point sampling in the flow; the fixed step sampling starts sampling from the beginning of the flow with a fixed step size; the burst point sampling is to search for the burst point of the data flow in the large flow for sampling.
  • the samples are vectorized and standardized, including:
  • the 1D-CNN network includes:
  • the 1D-CNN network part consists of two 1D-CNN convolutional layers, which perform two convolution operations on the input encrypted traffic vector, and perform batch normalization and nonlinear activation on the new feature map output by the convolution operation in each layer. and downsampling.
  • the device of the invention effectively utilizes the spatiotemporal characteristics of the encrypted traffic data stream, and proposes a new type of encrypted traffic hybrid neural network identification model based on the spatiotemporal characteristics of the stream.
  • the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory result in an article of manufacture comprising instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

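The invention discloses a method and device for identifying encrypted network traffic. The method comprises a preprocessing stage and a classification stage. The preprocessing stage splits the raw traffic into flows and then samples, vectorizes and standardizes it, and it introduces an in-flow sampling scheme for large flows to solve the classification problem for large-flow traffic. The classification stage first uses a CNN to capture spatial features and extract abstract features, then uses a stacked bidirectional LSTM on top of those abstract features to learn the temporal features of the traffic, achieving automatic feature extraction and efficient identification of encrypted traffic. The method is general: it extracts the spatiotemporal features of encrypted traffic automatically, without manual feature design by experts, and it adapts to the changes in traffic characteristics caused by different encryption and obfuscation techniques.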

Description

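A Method and Device for Identifying Encrypted Network Traffic
Technical Field
The invention relates to a method for identifying encrypted network traffic, and also to a device for identifying encrypted network traffic, and belongs to the technical fields of deep learning, network traffic analysis and cyberspace security applications.
Background
Traffic classification is one of the most important tasks in modern network communication, but the spread of encryption and the rapid growth of network throughput make fast, accurate identification of encrypted traffic increasingly difficult. Encrypted traffic classification matters for traffic engineering, network resource management, QoS (Quality of Service), cyberspace security management and more. In recent years, emerging network domains such as Internet of Things networks, software-defined networks and the mobile Internet have likewise produced strong demand for the analysis and management of encrypted traffic. For these reasons, network traffic classification has drawn growing attention from researchers in both academia and industry.
As demands for security and privacy have risen, traffic encryption has developed steadily, and encrypting traffic is now common practice in industry; research indicated that more than 83% of traffic would be encrypted by 2020. Encryption randomizes traffic, and this pseudo-random format makes traffic very difficult to parse. Meanwhile, ISPs (Internet Service Providers) often need to monitor or control certain types of traffic (for example P2P or intrusion attacks), and to evade detection by monitoring systems or firewalls, some developers use various protocol-embedding and traffic-obfuscation techniques. Traffic encryption and obfuscation thus meet users' needs and improve security and privacy on the one hand, while posing greater challenges to network management on the other. Encrypted traffic classification has therefore become a key technology in tasks such as traffic engineering and intrusion detection.
Existing solutions for encrypted traffic classification fall roughly into three categories: port-based, payload-based (for example Deep Packet Inspection, DPI) and based on statistical features. Because dynamic ports and port masquerading are widespread, traditional port-based classification has very low accuracy. Payload inspection methods such as DPI resemble regular-expression string matching: every sample in the fingerprint database must be matched against the complete traffic, so they are inefficient, and more importantly these fingerprints are generally hard to use for identifying encrypted traffic. Most existing work concentrates on statistics-based machine learning. Such methods require experts to design and extract statistical features of the traffic by hand in order to classify it fairly accurately. However, experts must design different statistical features for traffic in different scenarios, which is costly, and there is no guarantee that the extracted features actually improve the classification results. For these reasons, these methods struggle to meet the needs of encrypted traffic classification.
Deep learning has recently advanced rapidly and achieved remarkable results in computer vision, natural language processing and other fields, including many classification problems (for example image classification and text sentiment analysis). Deep learning methods are also gradually being applied to networking; traffic classification, for example, can be treated as a typical classification problem. Among deep learning methods, CNNs (convolutional neural networks) are good at capturing spatial features of data, and RNNs (recurrent neural networks) are good at capturing temporal features. Some studies already classify encrypted traffic with deep learning; most of them use a CNN to capture byte features at the packet level, but they make little use of the time-series features between packets.
In summary, current research on encrypted traffic classification still has the following shortcomings:
1) As encryption and obfuscation spread, traffic characteristics change easily. For rule-based methods (including port-based and payload-based methods), rules are hard to extract, tend to fail once the traffic changes, and are time-inefficient.
2) For statistics-based machine learning methods, designing features by hand is difficult, and obtaining more accurate traffic statistics usually requires more time-consuming offline algorithms, so real-time performance is poor.
3) Deep-learning-based research is still scarce, and existing work does not make effective use of the spatiotemporal features of traffic.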
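Summary of the Invention
The purpose of the invention is to overcome the shortcomings of the prior art by providing a method and device for identifying encrypted network traffic, addressing the problems that, in current network environments, encryption makes traffic identification algorithms time-consuming and poor in real-time performance.
To solve the above technical problem, the invention provides a method for identifying encrypted network traffic, comprising the following process:
acquiring multiple encrypted traffic files to be identified;
preprocessing the encrypted traffic to be identified, the preprocessing comprising: splitting the encrypted traffic into multiple flows; then collecting multiple consecutive data packets from each flow as a sample; and finally vectorizing and standardizing each sample to obtain a set of formatted sample vectors;
inputting the set of sample vectors obtained by preprocessing into the pre-trained hybrid neural network model to obtain a prediction vector, whose element values represent the predicted values of the encrypted traffic belonging to each class;
the hybrid neural network model comprises a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network; the 1D-CNN network learns spatial features from the input set of sample vectors and outputs a low-dimensional feature map; the stacked bidirectional LSTM network learns temporal features from the input feature map to obtain a feature map vector containing temporal features; and the fully connected layer determines the prediction vector from that feature map vector;
computing the predicted probability distribution over the classes from the prediction vector, and taking the class with the highest probability as the final classification label of the encrypted traffic.
Further, collecting multiple consecutive data packets from a flow as a sample comprises:
if the flow is a small flow, collecting a preset number of consecutive data packets from the head of the flow to form one sample; if fewer packets than the preset number exist, taking the existing packets and zero-padding the remaining ones;
if the flow is a large flow, selecting several sampling points from the flow and, starting from each sampling point, collecting a preset number of consecutive data packets to form one sample.
Further, the sampling points are selected by one of three strategies: random point sampling, fixed-step sampling and burst point sampling, wherein:
random point sampling samples at random points in the flow; fixed-step sampling samples with a fixed step starting from the beginning of the flow; and burst point sampling looks for burst points of the data stream within the large flow and samples there.
Further, vectorizing and standardizing a sample comprises:
keeping a preset number of bytes of each data packet, padding with zeros if the packet is shorter and truncating it otherwise; converting each sample into a vector;
standardizing every value in the vector.
Further, the 1D-CNN network comprises:
the 1D-CNN part consists of two 1D-CNN convolutional layers that apply two convolution operations to the input encrypted-traffic sample vector, and in each layer the new feature map output by the convolution undergoes batch normalization, nonlinear activation and downsampling.
Further, training the hybrid neural network model comprises:
acquiring multiple encrypted traffic files and annotating each with its classification label;
preprocessing each encrypted traffic file, the preprocessing comprising: splitting each encrypted traffic file into multiple flows; then collecting multiple consecutive data packets from each flow as a sample; and finally vectorizing and standardizing each sample to obtain a set of formatted sample vectors used as training samples;
training the hybrid neural network model, comprising the 1D-CNN network, the stacked bidirectional LSTM network and the fully connected layer network, on the training samples to obtain the best network parameters;
obtaining the trained hybrid neural network model.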
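Further, training the 1D-CNN network comprises:
The temporal dimension of the data packets in each sample is preserved during vectorization. Let $x^{<t>} \in \mathbb{R}^{L}$ be the t-th data packet in a sample, where t is any integer from 0 to n and denotes any packet in the vector; it is an L-dimensional vector;
$$x = [x^{<1>}, x^{<2>}, \dots, x^{<M>}] \tag{5}$$
x denotes one sample, made up of the vectors of M data packets. In the 1D-CNN, x can be viewed as having M channels, each an L-dimensional vector, so x is a two-dimensional array. Let $x_{i:i+j}$ denote the bytes of all channels from any position i to i+j. The one-dimensional convolution over x is:
$$c_i^{t} = f\left(w^{t} \cdot x_{i:i+h-1} + b\right) \tag{6}$$
A convolutional layer usually contains several convolution kernels (filters); every filter performs the same operation and generates one channel of the new feature map. Taking any kernel t as an example, $w^{t}$ is the window that slides over x (covering h positions), b is the bias, and f is a nonlinear activation function;
$c_i^{t}$ is a feature generated by kernel t. As the filter slides over x, its convolution is applied to the bytes inside the window; overall, the sequence $\{x_{1:h}, x_{2:h+1}, \dots, x_{n-h+1:n}\}$ produces a new feature map. All filters perform the same operation, but each filter has its own parameters w and b;
$$c^{t} = [c_1^{t}, c_2^{t}, \dots, c_{n-h+1}^{t}] \tag{7}$$
Here $c^{t}$ denotes the new feature map generated by kernel t, which can also be viewed as output channel t. For the new feature map of each channel, a pooling layer (MaxPooling) is usually also applied to downsample it. The pooling operation resembles convolution in that it also slides a filter over the input, but the operation performed in each window is $\max(\cdot)$, which keeps the maximum value of each sliding window.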
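Further, training the stacked bidirectional LSTM network comprises:
In the LSTM, the channels of the input feature map are treated as time steps, and the following formula holds at each time step:
$$\tilde{c}^{<t>} = \tanh\left(w_c[h^{<t-1>}, a^{<t>}] + b_c\right) \tag{8}$$
where $a^{<t>}$ is the vector at any time step t (that is, channel t) of the input feature map; its dimension equals that of the feature map fed in at each time step, say m (the dimension of the new features generated by the 1D-CNN);
$h^{<t-1>}$ is the hidden-layer output of the previous time step; its dimension is set by the hidden-size parameter of the LSTM unit, say s;
$\tilde{c}^{<t>}$ is the intermediate output of the current layer; $w_c$ and $b_c$ are the parameter matrix and bias, respectively;
In the LSTM, the final output is determined by three gates: the update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$. The gate values are computed as follows:
$$\Gamma_u = \sigma\left(w_u[h^{<t-1>}, a^{<t>}] + b_u\right) \tag{9}$$
$$\Gamma_f = \sigma\left(w_f[h^{<t-1>}, a^{<t>}] + b_f\right) \tag{10}$$
$$\Gamma_o = \sigma\left(w_o[h^{<t-1>}, a^{<t>}] + b_o\right) \tag{11}$$
where σ is a nonlinear activation function, and $w_u$, $w_f$, $w_o$ and $b_u$, $b_f$, $b_o$ are the parameter matrices and biases of the three gates;
The three gate values are computed in much the same way as $\tilde{c}^{<t>}$: they are determined by the input $a^{<t>}$ of the current time step and the hidden-layer output $h^{<t-1>}$ of the previous time step. The update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$ act like switches that control whether the current LSTM unit updates with the current information, forgets past information, and outputs the final information. The three gates produce the final output as follows, where $c^{<t>}$ is the intermediate output vector of the current layer:
$$c^{<t>} = \Gamma_u \odot \tilde{c}^{<t>} + \Gamma_f \odot c^{<t-1>} \tag{12}$$
$$h^{<t>} = \Gamma_o \odot \tanh\left(c^{<t>}\right) \tag{13}$$
The three gates respectively decide how much of the past, the current and the overall output to keep; the final result is the hidden-layer output $h^{<t>}$ of the current time step. Here ⊙ is the Hadamard product, i.e. element-wise multiplication of vectors;
A stacked LSTM is built by stacking several layers of LSTM units, while a bidirectional LSTM runs the LSTM over the time steps in the forward and reverse directions at once. Computing the Bi-LSTM output only requires concatenating the outputs of the two directions at the current time step:
$$h^{<t>} = \left[\overrightarrow{h}^{<t>}, \overleftarrow{h}^{<t>}\right] \tag{14}$$
At each time step, the hidden-layer output $h^{<t>}$ is the concatenation of the forward output $\overrightarrow{h}^{<t>}$ and the reverse output $\overleftarrow{h}^{<t>}$ at that time step.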
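Further, computing the predicted probability distribution over the classes from the prediction vector comprises:
computing the predicted probability distribution as follows:
$$\hat{y}_i = \mathrm{softmax}(o)_i = \frac{e^{o_i}}{\sum_{j} e^{o_j}} \tag{15}$$
where o is the raw output vector of the hybrid neural network model, $\hat{y}$ is the predicted probability vector after softmax, and $o_i$ is the value at position i of o.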
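Correspondingly, the invention also provides a device for identifying encrypted network traffic, comprising an encrypted traffic acquisition module, a preprocessing module, a classification prediction module and a classification identification module, wherein:
the encrypted traffic acquisition module is used to acquire the encrypted traffic file to be identified;
the preprocessing module is used to preprocess the encrypted traffic to be identified and comprises a flow splitting unit, a collection unit and a vectorization unit, wherein:
the flow splitting unit is used to split the encrypted traffic into multiple flows;
the collection unit is used to collect multiple consecutive data packets from each flow as a sample;
the vectorization unit is used to vectorize and standardize each sample to obtain a set of formatted sample vectors;
the classification prediction module is used to input the set of sample vectors obtained by preprocessing into the pre-trained hybrid neural network model to obtain a prediction vector, whose element values represent the predicted values of the encrypted traffic belonging to each class;
the hybrid neural network model comprises a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network; the 1D-CNN network learns spatial features from the input set of sample vectors and outputs a low-dimensional feature map; the stacked bidirectional LSTM network learns temporal features from the input feature map to obtain a feature map vector containing temporal features; and the fully connected layer determines the prediction vector from that feature map vector;
the classification identification unit is used to compute the predicted probability distribution over the classes from the prediction vector and to take the class with the highest probability as the final classification label of the encrypted traffic.
Further, the collection unit comprises:
a small-flow sampling unit, used to collect a preset number of consecutive data packets from the head of the flow to form one sample; if fewer packets than the preset number exist, the existing packets are taken and the remaining ones are zero-padded;
a large-flow sampling unit, used to select several sampling points from the flow and, starting from each sampling point, collect a preset number of consecutive data packets to form one sample.
Further, in the large-flow sampling unit, the sampling points are selected by one of three strategies: random point sampling, fixed-step sampling and burst point sampling, wherein:
random point sampling samples at random points in the flow; fixed-step sampling samples with a fixed step starting from the beginning of the flow; and burst point sampling looks for burst points of the data stream within the large flow and samples there.
Further, vectorizing and standardizing a sample in the vectorization unit comprises:
keeping a preset number of bytes of each data packet, padding with zeros if the packet is shorter and truncating it otherwise; converting each sample into a vector;
standardizing every value in the vector.
Further, the 1D-CNN network comprises:
the 1D-CNN part consists of two 1D-CNN convolutional layers that apply two convolution operations to the input encrypted-traffic vector, and in each layer the new feature map output by the convolution undergoes batch normalization, nonlinear activation and downsampling.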
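Compared with the prior art, the invention achieves the following beneficial effects:
1) The invention extracts traffic features automatically using deep learning (CNN and RNN). The method is general and is not tied to a particular network environment or application scenario; compared with rule-based methods, it can adapt to the changes in traffic characteristics brought about by different encryption and obfuscation techniques.
2) The invention proposes a hybrid neural network model that combines CNN and RNN and, using only a small number of data packets, extracts abstract features from the traffic and learns the spatiotemporal features of the data stream, enabling early identification of traffic. The method requires no manual feature design by experts, and in tests on several real network datasets it outperformed identification methods based on traditional machine learning.
3) The invention automatically splits the raw encrypted traffic into flows, vectorizes and standardizes it, and preserves the temporal features of the flows. The method makes effective use of the spatial distribution and temporal features of traffic data to learn features automatically […] imbalance.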
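Brief Description of the Drawings
Figure 1 is the overall framework of the encrypted traffic identification method;
Figure 2 is a schematic diagram of the traffic vectorization method;
Figure 3 is the overall architecture diagram of the hybrid neural network model;
Figure 4 is the flowchart of the encrypted traffic identification method;
Figure 5 is a schematic diagram of the detailed architecture and parameter settings of the classification model.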
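Detailed Description
The invention is further described below with reference to the drawings. The following embodiments serve only to illustrate the technical solution of the invention more clearly and do not limit its scope of protection.
Embodiment 1
The invention provides a method for identifying encrypted network traffic, characterized by comprising the following process:
acquiring the encrypted traffic file to be identified;
preprocessing the encrypted traffic to be identified, the preprocessing comprising: splitting the encrypted traffic into multiple flows; then collecting multiple consecutive data packets from each flow as a sample; and finally vectorizing and standardizing each sample to obtain a set of formatted sample vectors;
inputting the set of sample vectors obtained by preprocessing into the pre-trained hybrid neural network model to obtain a prediction vector, whose element values represent the predicted values of the encrypted traffic belonging to each class;
the hybrid neural network model comprises a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network; the 1D-CNN network learns spatial features from the input set of sample vectors and outputs a low-dimensional feature map; the stacked bidirectional LSTM network learns temporal features from the input feature map to obtain a feature map vector containing temporal features; and the fully connected layer determines the prediction vector from that feature map vector;
computing the predicted probability distribution over the classes from the prediction vector, and taking the class with the highest probability as the final classification label of the encrypted traffic.
The invention uses hybrid neural network techniques to learn the spatiotemporal features of encrypted traffic automatically, and thereby identifies encrypted traffic quickly and accurately.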
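Embodiment 2
How features for identifying encrypted traffic are extracted depends on the traffic preprocessing, the vectorization method, and the information in different parts of the traffic stream. For example, the meta-information and the payload of the traffic provide different, useful features for identifying encrypted traffic. On the one hand, this scheme combines traffic meta-information, part of each packet's payload, and the temporal features between packets to improve the completeness of the data. On the other hand, the method designs a hybrid neural network model to learn representations of that information automatically.
Figure 1 is the overall framework of the method, which has two main stages: preprocessing and classification. The preprocessing stage converts raw traffic directly into standardized data in four steps: flow splitting, flow sampling, vectorization and standardization. The classification stage uses a purpose-designed hybrid neural network model to capture the spatiotemporal features of a flow, with a part that learns spatial-distribution (abstract) features and a part that learns temporal features, in order to classify encrypted traffic.
To make the technical solution easier to follow, some concepts are defined first:
Definition 1: five-tuple (T).
For every data packet in the network, the corresponding five-tuple can be found from the packet's header information (meta-information):
$$T = \{\text{Src IP}, \text{Src Port}, \text{Dest IP}, \text{Dest Port}, \text{Protocol}\} \tag{1}$$
The elements are the source address, source port, destination address, destination port and transport-layer protocol.
Definition 2: in-flow sampling.
Real networks carry many long-lived connections that generate long-duration large flows; such a flow may contain tens of thousands to millions of data packets, so a single flow is huge. In anonymity networks, for example, virtual-circuit techniques mean that large numbers of packets share the same five-tuple. In some data centers, traffic analysis and management also focuses on large flows, which some literature calls elephant flows. Let each flow be $F = [P_0, P_1, \dots, P_n]$, where $P_i$ are the packets in F sharing the same five-tuple. From the n packets of the flow, m sampling points are chosen, $S = \{s_0, s_1, \dots, s_m\}$. Starting at each sampling point, M consecutive packets are collected to form a sample, $F_{sub} = [P_{s_i}, P_{s_i+1}, \dots, P_{s_i+M-1}]$, where i is an integer from 0 to m. The invention proposes three flow sampling schemes.
① Random Sampling: the default strategy, which samples at random points in the flow. Each sampling point $s_i$ is a random point between 0 and n.
$$s_i \in \{0, 1, 2, \dots, n\} \tag{2}$$
② Fixed Step Sampling: sample with a fixed step from the start of the flow. The step is a fixed-length constant, and adjacent sampling points satisfy:
$$s_0 = 0 \tag{3}$$
$$s_i = s_{i-1} + step \tag{4}$$
③ Burst Sampling: look for burst points in the large flow and sample there. In large-flow traffic, different user behaviors change packet lengths; a user click, for example, triggers data transfer and typically makes the flow fluctuate. Before the data stream changes, some frames that carry no data are usually exchanged, and in a network a TCP or UDP frame that carries no data is no longer than 60 bytes. Burst Sampling therefore detects such data points and selects them as sampling points.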
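The three sampling-point strategies can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names are hypothetical, random points are drawn from the valid indices 0 to n-1, and burst detection is read simply as "packets no longer than 60 bytes", which is one possible interpretation of the description above.

```python
import random

def random_points(n, m, seed=None):
    """Random Sampling: draw m start indices at random from 0..n-1."""
    rng = random.Random(seed)
    return sorted(rng.randrange(n) for _ in range(m))

def fixed_step_points(n, step):
    """Fixed Step Sampling: s_0 = 0 and s_i = s_{i-1} + step, while inside the flow."""
    return list(range(0, n, step))

def burst_points(packet_lengths, threshold=60):
    """Burst Sampling (assumed reading): indices of packets whose length
    does not exceed `threshold` bytes, i.e. frames that carry no data."""
    return [i for i, length in enumerate(packet_lengths) if length <= threshold]
```

Each function returns start indices into the flow; a sample is then the M consecutive packets beginning at each index.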
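The method for identifying encrypted network traffic comprises preprocessing of the raw traffic files, sampling, vectorization, spatiotemporal feature learning and other processes. Its flowchart is shown in Figure 4. The specific steps are as follows:
Step 1: Split the raw encrypted traffic into flows according to the five-tuple of each data packet, obtaining a set of flows in which each flow contains packets with the same five-tuple.
In a real network, traffic captured at a node is not an ordered sequence from a single application but a mixture from many applications. Traffic captured at a gateway during some period, for instance, may contain the packets generated by every host in the network that passes through that gateway, all mixed into the current throughput. To separate the data streams produced by each single type, the raw traffic provided in the dataset must be split into flows.
There are two common kinds of encryption: application-layer and network-layer. Application-layer encryption mainly encrypts the application-layer protocol of a message; common protocols include BitTorrent and HTTPS. Network-layer encryption encrypts the entire message above the network layer.
Encrypted traffic usually still contains unencrypted parts, such as the traffic's meta-information. The encrypted traffic can therefore be split into flows using that meta-information, although the application-layer information and payload of the encrypted part remain inaccessible. In a network, a flow is the set of all packets sharing the same five-tuple (source IP, source port, destination IP, destination port and transport-layer protocol). The raw traffic file (a PCAP file storing encrypted network traffic) is read, buffered and split by five-tuple to produce flows $F = [P_0, P_1, \dots, P_i, \dots, P_n]$, where $P_i$ is the i-th packet in F with the same five-tuple. Flow splitting yields a set of flows, each made up of packets with the same five-tuple.
The split flows are labeled according to the type of the raw traffic file: if a PCAP file in the network traffic dataset carries a type label (for example, the file is annotated with a traffic service type such as Chat, Email or Video, depending on the classification task), every flow obtained from that file is labeled with that type. These labels are used to train the hybrid neural network model.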
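The flow splitting of step 1 amounts to grouping packets by their five-tuple while preserving arrival order. The sketch below assumes the packets have already been parsed from the PCAP file into records; the `Packet` type and its field names are hypothetical and stand in for whatever parser is actually used.

```python
from collections import defaultdict, namedtuple

# Hypothetical parsed-packet record; a real pipeline would fill it from a PCAP parser.
Packet = namedtuple("Packet", "src_ip src_port dst_ip dst_port protocol payload")

def split_flows(packets):
    """Group packets by five-tuple, keeping the original packet order in each flow."""
    flows = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.src_port, p.dst_ip, p.dst_port, p.protocol)
        flows[key].append(p)
    return dict(flows)
```

The key is directional, following the definition of a flow as packets with the same five-tuple; treating both directions of a connection as one flow would need a normalized key instead.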
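Step 2: For the flow set obtained in step 1, apply different sampling schemes depending on flow duration (which divides flows into large and small), sampling consecutive packets from each flow to form raw training samples and obtain the raw sample set.
Real networks are imbalanced: flow lengths vary enormously, and upstream and downstream traffic are usually asymmetric. Some literature divides the flows in the data into large flows (long-duration data streams) and small flows (short-duration data streams). Flows differ in duration and in the packets they contain: a small flow may contain tens to hundreds of packets, whereas a large flow may contain tens of thousands to millions. Large flows usually take up a great deal of storage, so if the flow is the unit, it is hard to collect enough flow samples for training. A small flow usually contains a single communication exchange, so its connection characteristics are concentrated in the first few packets; a large flow contains many exchanges by the application, so packets throughout the flow should likewise carry much information useful for classification. In this method, therefore, each small flow contributes a single sample made of a few packets from the head of the flow, while each large flow is sampled in-flow to yield multiple samples, which alleviates the imbalance in network traffic data (in existing datasets, large-flow samples are large in volume but very few in number).
Depending on the flow type (large or small), a different sampling scheme selects a preset number of consecutive packets from each flow (denoted M, a small constant, for example M = 10) to form one training sample. For a small flow, the packets at the head of the flow are collected, since they contain most of the connection-establishment information. For a large flow, in-flow sampling selects a suitable number of sampling points and takes a few consecutive packets at each point to form a separate sample, which addresses the data imbalance of large flows.
Every sample keeps M packets for two reasons. First, it enables early identification of traffic: only a few packets are needed to identify encrypted traffic (a flow may contain tens to tens of thousands of packets, while this method uses only a few, for example M = 10), which keeps the method lightweight and greatly reduces its storage requirements. Second, in practice, uniformly formatted data simplifies computation when training the model.
The specific process of this step is:
① If the input flow is a small flow, go to step ② and apply the small-flow sampling scheme. If it is a large flow, go to step ③ and apply in-flow sampling.
② For a small flow, collect the M packets at the head of the flow, $F_{sub} = [P_0, P_1, \dots, P_{M-1}]$, to form one sample; if fewer than M packets exist, take all of them and zero-pad the remaining packets.
③ For a large flow, choose m sampling points from the n packets of the flow, $S = \{s_0, s_1, \dots, s_m\}$. Starting at each sampling point, collect M consecutive packets to form a sample, $F_{sub} = [P_{s_i}, P_{s_i+1}, \dots, P_{s_i+M-1}]$, where i is an integer from 0 to m. Sampling points are chosen by one of three strategies: Random Sampling, the default, which samples at random points in the flow; Fixed Step Sampling, which samples with a fixed step from the start of the flow; and Burst Sampling, which looks for burst points of the data stream in the large flow and samples there.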
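The branching in step 2 can be sketched as below. Packets are represented as `bytes`, missing packets as empty byte strings (zero-filled later during vectorization), and `large_threshold` is a hypothetical cutoff, since the text does not say where the boundary between small and large flows lies. When no sampling points are supplied, the sketch falls back to a fixed step, which is an illustrative choice and not the default named in the text.

```python
def take_sample(flow, start, M):
    """Return M consecutive packets starting at `start`, padded with empty packets."""
    window = list(flow[start:start + M])
    return window + [b""] * (M - len(window))

def sample_flow(flow, M=10, large_threshold=1000, points=None):
    """Small flow: one sample from the flow head. Large flow: one sample per point."""
    if len(flow) <= large_threshold:          # small flow (assumed cutoff)
        return [take_sample(flow, 0, M)]
    if points is None:                        # illustrative fallback: fixed step
        points = range(0, len(flow), large_threshold)
    return [take_sample(flow, s, M) for s in points]
```

Every returned sample has exactly M entries, which is what allows the later conversion to a fixed (M, L) shape.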
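Step 3: Each sample in the raw sample set from step 2 contains M data packets, and L bytes of each packet are kept, so each sample is converted into a vector of dimension (M, L). This regularizes the raw traffic into vectors of a uniform shape that are easy for a computer to read and operate on. Each sample vector is then standardized to obtain the set of formatted sample vectors; existing research shows that standardizing data speeds up gradient descent and helps the model converge quickly.
The specific process is as follows:
① Each raw sample from step 2 contains M packets, and each packet is kept at a fixed preset length L: shorter packets are padded with zeros and longer ones are truncated. L defaults to 1500 because the Ethernet MTU (Maximum Transmission Unit, the largest payload an Ethernet frame carries) is 1500 bytes; for generality, every packet keeps length L by default. Figure 2 shows a formatted sample in two-dimensional form.
② The formatted sample from step ① is read as a binary stream byte by byte, and the 8 bits of each byte are read as a decimal number, giving an integer from 0 to 255. Representing each byte, and hence the whole vector, as integers vectorizes the raw sample and simplifies computation. Each raw sample becomes a vector of dimension (M, L).
③ To speed up computation and reduce the gradient explosion problem in deep learning, the vector from step ② is standardized. Since vectorization reads each byte as an integer (0 to 255), the values can simply be divided by 255, giving the set of formatted samples.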
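Steps ① to ③ reduce to a few lines when a sample is held as a list of `bytes` objects. The sketch below is illustrative only; the function name is an assumption, and nested lists stand in for whatever array type a real implementation would use.

```python
def vectorize(sample, L=1500):
    """Convert a list of M raw packets (bytes) into an M x L list of floats in [0, 1]."""
    rows = []
    for packet in sample:
        fixed = packet[:L].ljust(L, b"\x00")         # truncate or zero-pad to L bytes
        rows.append([byte / 255 for byte in fixed])  # each byte is an int in 0..255
    return rows
```

For example, with L = 4 the packet `b"\xff\x00\x80"` becomes `[1.0, 0.0, 128/255, 0.0]`, and an empty (padded) packet becomes a row of zeros.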
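Step 4: Repeat steps 1 to 3 to obtain a large number of formatted training samples, and feed them into the hybrid neural network model for training.
Different traffic has different temporal and spatial-distribution features (spatiotemporal features); the hybrid neural network model can extract the spatiotemporal features of a flow and improve prediction accuracy. Traditional methods require experts to hand-design rules or statistical features (such as flow duration, flow size, packet size and inter-packet interval) from unencrypted traffic information (for example datagram headers), communication behavior, payload distribution and similar information. The hybrid neural network model needs no manual feature design and extracts traffic features automatically.
The hybrid neural network model has a spatial (abstract) feature learning part and a temporal feature learning part. Convolutional neural networks (CNNs) are widely used for images, and existing research shows that, through the downsampling of multiple CNN layers, a model can learn more abstract features of an image's spatial distribution (for example local features of an animal image such as eyes, mouth and limbs). The raw vectors from step 3 are high-dimensional; along with useful information they bring more noise, which makes feature learning harder. The spatial (abstract) feature learning part therefore uses a one-dimensional convolutional neural network (1D-CNN) with repeated downsampling to reduce the feature dimension and learn abstract features of the traffic's spatial distribution. The temporal feature learning part uses a stacked bidirectional LSTM (Long Short-Term Memory) to capture the temporal correlation between data packets.
The design of the hybrid neural network model is described in detail from three angles: principle, overall architecture and detailed parameters: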
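1) Principle
The invention preserves the temporal dimension of the data packets in each sample during vectorization. Let $x^{<t>} \in \mathbb{R}^{L}$ be the t-th data packet in a sample, where t is any integer from 0 to n and denotes any packet in the vector; it is an L-dimensional vector.
$$x = [x^{<1>}, x^{<2>}, \dots, x^{<M>}] \tag{5}$$
x denotes one sample, made up of the vectors of M data packets. In the 1D-CNN, x can be viewed as having M channels, each an L-dimensional vector, so x is a two-dimensional array. Let $x_{i:i+j}$ denote the bytes of all channels from any position i to i+j. The one-dimensional convolution over x is:
$$c_i^{t} = f\left(w^{t} \cdot x_{i:i+h-1} + b\right) \tag{6}$$
A convolutional layer usually contains several convolution kernels (filters); every filter performs the same operation and generates one channel of the new feature map. Taking any kernel t as an example, $w^{t}$ is the window that slides over x (covering h positions), b is the bias, and f is a nonlinear activation function.
$c_i^{t}$ is a feature generated by kernel t. As the filter slides over x, its convolution is applied to the bytes inside the window; overall, the sequence $\{x_{1:h}, x_{2:h+1}, \dots, x_{n-h+1:n}\}$ produces a new feature map. All filters perform the same operation, but each filter has its own parameters w and b.
$$c^{t} = [c_1^{t}, c_2^{t}, \dots, c_{n-h+1}^{t}] \tag{7}$$
Here $c^{t}$ denotes the new feature map generated by kernel t, which can also be viewed as output channel t. For the new feature map of each channel, a pooling layer (MaxPooling) is usually also applied to downsample it. The pooling operation resembles convolution in that it also slides a filter over the input, but the operation performed in each window is $\max(\cdot)$, which keeps the maximum value of each sliding window.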
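To make the sliding-window convolution and max pooling concrete, here is a small pure-Python sketch for a single kernel. It assumes no padding and a stride of 1 for the convolution, with the kernel holding one row of h weights per input channel; these layout choices and the function names are illustrative assumptions.

```python
def conv1d(x, w, b, f):
    """x: M channels of length n; w: M rows of h weights; returns n-h+1 features.
    Each feature is f(sum over channels and window positions of w*x, plus b)."""
    n, h = len(x[0]), len(w[0])
    out = []
    for i in range(n - h + 1):
        s = sum(w[ch][k] * x[ch][i + k] for ch in range(len(x)) for k in range(h))
        out.append(f(s + b))
    return out

def max_pool(c, size=2, stride=2):
    """Keep the maximum of each sliding window over the feature map c."""
    return [max(c[i:i + size]) for i in range(0, len(c) - size + 1, stride)]

relu = lambda v: max(0.0, v)  # a common choice for the nonlinear activation f
```

For a one-channel input `[[1, 2, 3, 4, 5]]` with kernel `[[1, 1, 1]]` and bias 1, `conv1d` gives `[7, 10, 13]`; pooling `[7, 10, 13, 2]` with window 2 and stride 2 gives `[10, 13]`.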
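In essence, a 1D-CNN is similar to a fully connected neural network, but it shares convolution-kernel weights and uses sparse connections, which helps considerably when operating on high-dimensional vectors. In addition, the 1D-CNN downsamples the traffic several times; as depth increases, the convolutions produce more abstract feature maps, so the hybrid neural network model learns higher-level abstract features of the spatial distribution of the raw traffic, which helps the subsequent learning of temporal features.
Network traffic is also strongly time-correlated data, so an LSTM suits it as well. However, the feature dimension is very large after the raw traffic file is vectorized, so the LSTM-based part of the network is designed to operate on the learned abstract features. In the LSTM, the channels of the input feature map are treated as time steps, and the following formula holds at each time step:
$$\tilde{c}^{<t>} = \tanh\left(w_c[h^{<t-1>}, a^{<t>}] + b_c\right) \tag{8}$$
where $a^{<t>}$ is the vector at any time step t (that is, channel t) of the input feature map; its dimension equals that of the feature map fed in at each time step, say m (the dimension of the new features generated by the 1D-CNN). $h^{<t-1>}$ is the hidden-layer output of the previous time step; its dimension is set by the hidden-size parameter of the LSTM unit, say s. $\tilde{c}^{<t>}$ is the intermediate output of the current layer. $w_c$ and $b_c$ are the parameter matrix and bias, respectively.
In the LSTM, however, the final output is determined by three gates: the update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$. The gate values are computed as follows:
$$\Gamma_u = \sigma\left(w_u[h^{<t-1>}, a^{<t>}] + b_u\right) \tag{9}$$
$$\Gamma_f = \sigma\left(w_f[h^{<t-1>}, a^{<t>}] + b_f\right) \tag{10}$$
$$\Gamma_o = \sigma\left(w_o[h^{<t-1>}, a^{<t>}] + b_o\right) \tag{11}$$
where σ is a nonlinear activation function, and $w_u$, $w_f$, $w_o$ and $b_u$, $b_f$, $b_o$ are the parameter matrices and biases of the three gates.
As can be seen, the three gate values are computed in much the same way as $\tilde{c}^{<t>}$: they are determined by the input $a^{<t>}$ of the current time step and the hidden-layer output $h^{<t-1>}$ of the previous time step. The update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$ act like switches that control whether the current LSTM unit updates with the current information, forgets past information, and outputs the final information. The three gates produce the final output as follows, where $c^{<t>}$ is the intermediate output vector of the current layer:
$$c^{<t>} = \Gamma_u \odot \tilde{c}^{<t>} + \Gamma_f \odot c^{<t-1>} \tag{12}$$
$$h^{<t>} = \Gamma_o \odot \tanh\left(c^{<t>}\right) \tag{13}$$
The three gates respectively decide how much of the past, the current and the overall output to keep. The final result is the hidden-layer output $h^{<t>}$ of the current time step. Here ⊙ is the Hadamard product, i.e. element-wise multiplication of vectors.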
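One time step of the gated update in formulas (8) to (13) can be written out directly. The sketch below takes σ to be the logistic sigmoid, which is the usual choice but is not stated explicitly in the text, and stores parameters in a plain dictionary whose key names are hypothetical.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def affine(w, b, v):
    """Compute w @ v + b for a list-of-rows matrix w."""
    return [sum(wi * vi for wi, vi in zip(row, v)) + bi for row, bi in zip(w, b)]

def lstm_step(params, h_prev, c_prev, a_t):
    """One LSTM time step: returns (h_t, c_t)."""
    z = h_prev + a_t                                   # concatenation [h<t-1>, a<t>]
    c_tilde = [math.tanh(v) for v in affine(params["wc"], params["bc"], z)]   # (8)
    g_u = [sigmoid(v) for v in affine(params["wu"], params["bu"], z)]         # (9)
    g_f = [sigmoid(v) for v in affine(params["wf"], params["bf"], z)]         # (10)
    g_o = [sigmoid(v) for v in affine(params["wo"], params["bo"], z)]         # (11)
    c_t = [u * ct + f * cp for u, ct, f, cp in zip(g_u, c_tilde, g_f, c_prev)]  # (12)
    h_t = [o * math.tanh(c) for o, c in zip(g_o, c_t)]                        # (13)
    return h_t, c_t
```

Each weight matrix has s rows and s + m columns, where s is the hidden size and m the input dimension, matching the concatenated input.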
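Because abstract features are learned first, a stacked bidirectional LSTM network is used to strengthen the capture of temporal features. A stacked LSTM is built by stacking several layers of LSTM units, while a bidirectional LSTM (Bi-LSTM) runs the LSTM over the time steps in the forward and reverse directions at once, because the context of the current time step includes information from both before and after it. Computing the Bi-LSTM output only requires concatenating the outputs of the two directions at the current time step, for example:
$$h^{<t>} = \left[\overrightarrow{h}^{<t>}, \overleftarrow{h}^{<t>}\right] \tag{14}$$
That is, at each time step, the hidden-layer output $h^{<t>}$ is the concatenation of the forward output $\overrightarrow{h}^{<t>}$ and the reverse output $\overleftarrow{h}^{<t>}$ at that time step.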
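The concatenation in formula (14) can be illustrated independently of the cell internals. The sketch below accepts any pair of step functions with the signature `step(h, c, a) -> (h, c)`, one per direction; this interface is an assumption made for illustration, and the initial states are shared by both directions for brevity.

```python
def bidirectional(step_fwd, step_bwd, inputs, h0, c0):
    """Run one recurrent pass in each direction and concatenate per time step."""
    def run(step, seq):
        h, c, outs = h0, c0, []
        for a in seq:
            h, c = step(h, c, a)
            outs.append(h)
        return outs
    fwd = run(step_fwd, inputs)
    bwd = run(step_bwd, inputs[::-1])[::-1]   # re-align reverse outputs with time
    return [f + b for f, b in zip(fwd, bwd)]  # list concatenation at each step
```

With a hidden size of 256 per direction, each concatenated output has 512 entries, which is where the 512-dimensional output mentioned in the parameter settings comes from.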
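2) Overall architecture
The hybrid neural network model works in two stages; Figure 3 shows its overall architecture. The input is the high-dimensional vector of encrypted traffic. First, a 1D-CNN-based network learns abstract spatial features: two one-dimensional convolutional layers (Conv-1, Conv-2) perform spatial feature learning and downsampling on the input set of sample vectors, producing a new low-dimensional feature map. On top of this, the second part captures temporal features with a stacked bidirectional LSTM network of two bidirectional LSTM layers: at each time step, the vector of one channel of the feature map produced by the 1D-CNN is fed in, and the stacked bidirectional LSTM learns the temporal features of the feature map, yielding a feature map vector containing temporal features. Finally, a fully connected layer converts the previous layer's feature map into a c-dimensional vector, where c is the number of traffic types (for example traffic service types such as Chat, Email or Video, depending on the classification task). The predicted label of the encrypted traffic is then obtained from it with the softmax function.
In the abstract feature learning part, the network is first designed around a 1D-CNN that automatically extracts abstract features from the traffic. The 1D-CNN shares convolution-kernel weights and uses sparse connections, which reduces the parameter count and helps capture similar spatial features located at different positions in the traffic stream. The 1D-CNN also downsamples the traffic several times; as depth increases, the convolutions produce more abstract feature maps and the model learns higher-level abstract features from the raw traffic, which helps the subsequent learning of temporal features.
In the temporal feature learning part, a stacked bidirectional LSTM network is used. A stacked LSTM is built by stacking several layers of LSTM units, while a bidirectional LSTM (Bi-LSTM) runs the LSTM in the forward and reverse directions at once, because the context of the current time step includes information from both directions around the current position.
Through these two parts, abstract feature learning and temporal feature learning, the hybrid neural network model extracts traffic features automatically, with no manual feature design by experts.
For each input sample, the model first uses the 1D-CNN for spatial feature learning and downsampling to obtain a low-dimensional feature map, then uses the LSTM to obtain a feature map containing temporal features, and finally outputs a c-dimensional prediction vector o through the fully connected layer, where c is the number of traffic types (for example traffic service types such as Chat, Email or Video, depending on the classification task). Each element of o is the predicted value of the encrypted traffic to be identified belonging to the corresponding class. Because the network's output vector $o \in \mathbb{R}^{c}$ contains both positive and negative numbers, obtaining a predicted probability distribution (with all probabilities summing to 1) requires turning each predicted value into a positive number with an exponential; that is, the softmax function is applied to o to compute the model's predicted probability distribution vector $\hat{y}$, which is likewise c-dimensional, with the output at position i being the probability that the sample belongs to class i. Note that softmax does not take part in the training of the hybrid neural network model; it is used to compute the predicted probability distribution, as follows:
$$\hat{y}_i = \mathrm{softmax}(o)_i = \frac{e^{o_i}}{\sum_{j} e^{o_j}} \tag{15}$$
where o is the raw output vector of the hybrid neural network model, $\hat{y}$ is the predicted probability vector after softmax, and $o_i$ is the value at position i of o. Formula (15) works as follows: e is the base of the natural logarithm, and the exponential $e^{o_i}$ turns $o_i$ into a positive real number. The result at each position is then divided by $\sum_{j} e^{o_j}$ to give the model's predicted probability distribution vector $\hat{y}$.
Finally, cross-entropy is used as the loss function and the model is trained with gradient descent.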
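Formula (15) and the cross-entropy loss for a single labeled sample can be sketched as follows. Subtracting the maximum before exponentiating is a standard numerical-stability step that is not mentioned in the text and does not change the result; the function names are illustrative.

```python
import math

def softmax(o):
    """Formula (15): exponentiate each entry and normalize so the outputs sum to 1."""
    m = max(o)                           # stability shift; cancels in the ratio
    exps = [math.exp(v - m) for v in o]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Loss for one sample: negative log-probability assigned to the true class."""
    return -math.log(probs[true_class])
```

For a two-class output `[0.0, 0.0]` the probabilities are `[0.5, 0.5]` and the loss for either class is ln 2.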
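3) Detailed parameter settings
Figure 5 gives the detailed parameter settings of the hybrid neural network model. It has 13 layers (see “Tier Name”), which fall into 4 larger groups (see “Tier”). The figure lists the sizes of each layer's input and output vectors (see “Input” and “Output”) and the number of parameters each layer uses (see “Parameters”); the convolution kernel size and stride are the configurable parameters of the 1D-CNN (see “Convolution kernel” and “Step size”). The total number of trainable parameters is 2,897,104, which indicates the scale of the neural network and the overall size of the model. The classification model consists of three parts:
The first part is the convolutional layers. It contains two large convolutional blocks (Conv-1 and Conv-2). Each block contains one 1D-CNN layer with kernel size 3 and stride 1, followed by Batch Normalization of the layer's output, which makes gradient descent easier, then an activation layer (ReLU), and finally MaxPooling for downsampling with a pooling window of size 2 and stride 2. This part takes the high-dimensional encrypted-traffic vector as input, downsamples it while learning spatial features, and outputs a new low-dimensional feature map vector.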
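As a rough check of how much the two convolutional blocks shrink the input, the sketch below applies the standard output-length formula for a sliding window. It assumes no padding, which the text does not state, and an input length of L = 1500; the sizes listed in Figure 5 may therefore differ.

```python
def out_len(n, kernel, stride):
    """Length after sliding a window of `kernel` with `stride` over n positions."""
    return (n - kernel) // stride + 1

def conv_block(n):
    """One block: Conv (kernel 3, stride 1) followed by MaxPooling (window 2, stride 2)."""
    n = out_len(n, kernel=3, stride=1)
    return out_len(n, kernel=2, stride=2)

length_after_conv1 = conv_block(1500)                 # 1500 -> 1498 -> 749
length_after_conv2 = conv_block(length_after_conv1)   # 749 -> 747 -> 373
```

Under these assumptions each channel shrinks from 1500 to 373 values, roughly a fourfold reduction before the LSTM part.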
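The second part is the LSTM structure. The hidden size of each LSTM unit is set to 256; because the LSTM is bidirectional and concatenates the outputs of the forward and reverse directions, the output at each time step is 512-dimensional. Note that since a stacked bidirectional LSTM is used, every Bi-LSTM layer except the last must keep the output of every time step. To reduce overfitting, a Dropout layer is added after the Bi-LSTM (the activations of the final output neurons are switched off with a certain probability, called the dropout rate), with the dropout rate set to 0.5.
The third part is the fully connected layer, which takes a 512-dimensional input and outputs c dimensions (the final output dimension equals the number of classes; in the example of Figure 5 the final output is 16).
Finally, softmax computes the predicted probability of each class.
Because abstract spatial features are learned first and bidirectional temporal features are captured afterwards, the overall parameter count is much smaller than that of a CNN- or LSTM-based network. The hybrid neural network model combines the speed of a CNN with the time-step sensitivity of an RNN (recurrent neural network; this method uses LSTM, a kind of RNN), keeping the overall model lightweight while retaining the advantages of both. During training the batch size is set to 128 and the Adam optimizer is used. Learning-rate scheduling can be applied to help the model converge better.
The hybrid neural network model, comprising the 1D-CNN network, the stacked bidirectional LSTM network and the fully connected layer network, is trained on the training samples to obtain the best network parameters;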
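Step 5: Obtain the encrypted traffic file to be identified, process it with Steps 1 to 3, and feed the resulting sample vectors into the trained hybrid neural network model. The model outputs the raw prediction vector o of the encrypted traffic, which is a real-valued vector. The vector o must be processed with softmax to obtain the predicted probability distribution over the classes, $\hat{y}$, which is a c-dimensional vector (c is the number of traffic types) whose i-th entry is the probability that the sample belongs to class i; it is computed as in formula (15). From the predicted distribution vector $\hat{y}$, the final classification label of the input traffic is obtained:
$$\mathrm{label} = \arg\max_i \hat{y}_i$$
where $\hat{y}$ is the prediction result, a c-dimensional probability distribution vector. The argmax operation returns the index of the largest probability in $\hat{y}$ (the corresponding class), which is taken as the final classification label. The label represents a type of traffic (for example traffic service types such as Chat, Email and Video, depending on the classification task; all classes are numbered from 0).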
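Selecting the label from the predicted distribution amounts to an argmax over the probability vector. A minimal sketch follows; the class names and probability values are hypothetical examples.

```python
def classify(y_hat, class_names):
    """Return the index of the largest probability and the matching class name."""
    label = max(range(len(y_hat)), key=lambda i: y_hat[i])
    return label, class_names[label]

class_names = ["Chat", "Email", "Video"]   # classes are numbered from 0
label, name = classify([0.1, 0.2, 0.7], class_names)
```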
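The method of the present invention comprises a preprocessing stage and a classification stage. The preprocessing stage performs flow splitting, sampling, vectorization and standardization on the raw traffic, and proposes an in-flow sampling scheme for large flows to solve the classification problem for large-flow traffic (long-lived data flows). The classification stage first uses the CNN to capture spatial features and extract abstract features, then uses the stacked bidirectional LSTM on top of the abstract features to learn the temporal features of the traffic, achieving automatic feature extraction and efficient identification of encrypted traffic. The method is general: it automatically extracts the spatio-temporal features of encrypted traffic without manual feature design by experts, and it can adapt to changes in traffic features caused by different encryption and obfuscation techniques.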
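Embodiment 3
Correspondingly, the present invention further provides a network encrypted traffic identification device, comprising an encrypted traffic acquisition module, a preprocessing module, a classification prediction module and a classification identification module, wherein:
the encrypted traffic acquisition module is configured to obtain the encrypted traffic file to be identified;
the preprocessing module is configured to preprocess the encrypted traffic to be identified, and comprises a flow splitting unit, a collection unit and a vectorization unit, wherein:
the flow splitting unit is configured to split the encrypted traffic into multiple flows;
the collection unit is configured to collect multiple consecutive packets from each flow as a sample;
the vectorization unit is configured to vectorize and standardize each sample to obtain a formatted set of sample vectors;
the classification prediction module is configured to input the preprocessed set of sample vectors into the pre-trained hybrid neural network model to obtain a prediction vector, each element of which is the predicted value that the encrypted traffic belongs to the corresponding class;
the hybrid neural network model comprises a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network, wherein the 1D-CNN network performs spatial feature learning on the input set of sample vectors and outputs a low-dimensional feature map; the stacked bidirectional LSTM network performs temporal feature learning on the input feature map to obtain a feature map vector containing temporal features; and the fully connected layer determines the prediction vector from the input feature map vector containing temporal features;
the classification identification module is configured to compute the predicted probability distribution over the classes from the prediction vector and to take the class with the largest probability as the final classification label of the encrypted traffic.
For the specific implementation of each module of the device in this embodiment, and for the construction and training of the hybrid neural network model, the implementation of Embodiment 2 applies.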
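Further, the collection unit comprises:
a small-flow sampling unit, configured to collect a preset number of consecutive packets from the head of the flow to form one sample; if fewer packets than the preset number are available, the available packets are selected and the remaining packets are zero-padded;
a large-flow sampling unit, configured to select several sampling points in the flow and, starting from each sampling point, collect a preset number of consecutive packets to form one sample.
Further, in the large-flow sampling unit, the sampling point selection scheme includes three strategies: random point sampling, fixed-step sampling and burst point sampling, wherein:
random point sampling samples at random points in the flow; fixed-step sampling samples at a fixed step starting from the beginning of the flow; and burst point sampling locates the burst points of the data stream in the large flow and samples there.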
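The small-flow and large-flow sampling schemes can be sketched as follows. All function names are hypothetical. In particular, the burst criterion used here (a packet that arrives after an idle gap longer than a threshold starts a burst) is an assumption made for illustration, since this section does not define how burst points are detected. Missing packets in a small flow are represented as empty byte strings, to be zero-filled during vectorization.

```python
import random

def sample_small_flow(packets, n):
    """Take the first n packets; pad with empty packets if fewer are available."""
    head = packets[:n]
    return head + [b""] * (n - len(head))

def random_points(flow_len, n, count, rng):
    """Pick up to `count` distinct random start points that leave room for n packets."""
    last = flow_len - n
    if last < 0:
        return []
    return sorted(rng.sample(range(last + 1), min(count, last + 1)))

def fixed_step_points(flow_len, n, step):
    """Start points at a fixed step from the beginning of the flow."""
    return list(range(0, flow_len - n + 1, step))

def burst_points(timestamps, n, gap):
    """Assumed criterion: a burst starts where the preceding idle time exceeds `gap`."""
    starts = [i for i in range(1, len(timestamps))
              if timestamps[i] - timestamps[i - 1] > gap]
    return [i for i in starts if i + n <= len(timestamps)]

def sample_large_flow(packets, points, n):
    """Collect n consecutive packets starting at each sampling point."""
    return [packets[p:p + n] for p in points]

packets = [bytes([i]) for i in range(10)]            # hypothetical 10-packet flow
timestamps = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 5.3, 9.0, 9.1, 9.2]
fixed = fixed_step_points(len(packets), n=3, step=4)        # [0, 4]
bursts = burst_points(timestamps, n=3, gap=1.0)             # [3, 7]
rand = random_points(len(packets), 3, 2, random.Random(1))  # two points in 0..7
samples = sample_large_flow(packets, fixed, n=3)
```

Each strategy only produces start indices, so the same collection step serves all three.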
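Further, vectorizing and standardizing a sample in the vectorization unit comprises:
keeping a preset number of bytes of each packet, padding with zeros if the packet is shorter and truncating it otherwise; converting each sample into a vector;
standardizing every value in the vector.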
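The truncation, zero-padding and standardization of packets can be sketched as below. The function names are hypothetical, and scaling each byte by 255 into the range [0, 1] is an assumed form of standardization, since the exact formula is not given in this section.

```python
def vectorize_packet(packet, length):
    """Truncate or zero-pad one packet to `length` bytes, then scale bytes to [0, 1]."""
    fixed = packet[:length].ljust(length, b"\x00")
    return [byte / 255.0 for byte in fixed]  # division by 255 is an assumption

def vectorize_sample(packets, length):
    """Turn a sample (list of packets) into a list of equal-length float vectors."""
    return [vectorize_packet(p, length) for p in packets]

sample = [b"\x16\x03\x01\xff\x10", b"\x17", b""]  # long, short and missing packet
matrix = vectorize_sample(sample, length=4)
```

The long packet is cut to 4 bytes, the short one is filled with zeros, and the empty packet (a padded slot of a small flow) becomes an all-zero row.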
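Further, the 1D-CNN network comprises:
The 1D-CNN network part consists of two 1D-CNN convolutional layers that perform two convolution operations on the input encrypted traffic vector; in each layer, batch normalization, nonlinear activation and downsampling are applied to the new feature map output by the convolution operation. The device of the present invention makes effective use of the spatio-temporal features of encrypted traffic flows and provides a new hybrid neural network identification model for encrypted traffic based on flow spatio-temporal features, which identifies traffic accurately using only a small number of packets of a flow.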
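One filter of such a convolutional layer, followed by activation and max pooling, can be sketched in plain Python. This is a simplified illustration: batch normalization is omitted, the weights are hand-picked rather than learned, and the function names are hypothetical.

```python
def conv1d(x, w, b, stride=1):
    """Slide one filter over x. x: list of channels; w: one weight row per channel."""
    h = len(w[0])                       # kernel size
    n = len(x[0])                       # length of each channel
    out = []
    for i in range(0, n - h + 1, stride):
        s = b
        for channel, weights in zip(x, w):
            s += sum(wv * xv for wv, xv in zip(weights, channel[i:i + h]))
        out.append(s)
    return out

def relu(v):
    return [max(0.0, t) for t in v]

def max_pool(v, size=2, stride=2):
    """Keep the maximum of each sliding window."""
    return [max(v[i:i + size]) for i in range(0, len(v) - size + 1, stride)]

x = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],    # channel 0 (hypothetical values)
     [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]]    # channel 1
w = [[-1.0, 0.0, 1.0],                  # filter weights for channel 0
     [0.5, 0.5, 0.5]]                   # filter weights for channel 1
feature = relu(conv1d(x, w, b=0.0))     # kernel size 3, stride 1
pooled = max_pool(feature)              # window 2, stride 2
```

A kernel of size 3 with stride 1 shortens the 6-value input to 4 features, and pooling with window 2 and stride 2 halves that to 2.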
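Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, apparatus (system) and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and variations without departing from the technical principles of the present invention, and these improvements and variations shall also be regarded as falling within the scope of protection of the present invention.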

Claims (10)

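  1. A network encrypted traffic identification method, characterized by comprising the following process:
    obtaining the encrypted traffic file to be identified;
    preprocessing the encrypted traffic to be identified, the preprocessing comprising: splitting the encrypted traffic into multiple flows; then collecting multiple consecutive packets from each flow as a sample; and finally vectorizing and standardizing each sample to obtain a formatted set of sample vectors;
    inputting the preprocessed set of sample vectors into the pre-trained hybrid neural network model to obtain a prediction vector, each element of which is the predicted value that the encrypted traffic belongs to the corresponding class;
    wherein the hybrid neural network model comprises a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network; the 1D-CNN network performs spatial feature learning on the input set of sample vectors and outputs a low-dimensional feature map; the stacked bidirectional LSTM network performs temporal feature learning on the input feature map to obtain a feature map vector containing temporal features; and the fully connected layer determines the prediction vector from the input feature map vector containing temporal features;
    computing the predicted probability distribution over the classes from the prediction vector, and taking the class with the largest probability as the final classification label of the encrypted traffic.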
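  2. The network encrypted traffic identification method according to claim 1, characterized in that collecting multiple consecutive packets from a flow as a sample comprises:
    if the flow is a small flow, collecting a preset number of consecutive packets from the head of the flow to form one sample; if fewer packets than the preset number are available, selecting the available packets and zero-padding the remaining packets;
    if the flow is a large flow, selecting several sampling points in the flow and, starting from each sampling point, collecting a preset number of consecutive packets to form one sample.
  3. The network encrypted traffic identification method according to claim 2, characterized in that the sampling point selection scheme includes three strategies: random point sampling, fixed-step sampling and burst point sampling, wherein:
    random point sampling samples at random points in the flow; fixed-step sampling samples at a fixed step starting from the beginning of the flow; and burst point sampling locates the burst points of the data stream in the large flow and samples there.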
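  4. The network encrypted traffic identification method according to claim 1, characterized in that vectorizing and standardizing a sample comprises:
    keeping a preset number of bytes of each packet, padding with zeros if the packet is shorter and truncating it otherwise; converting each sample into a vector;
    standardizing every value in the vector.
  5. The network encrypted traffic identification method according to claim 1, characterized in that the 1D-CNN network comprises:
    two 1D-CNN convolutional layers that perform two convolution operations on the input encrypted traffic sample vector, wherein in each layer batch normalization, nonlinear activation and downsampling are applied to the new feature map output by the convolution operation.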
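  6. The network encrypted traffic identification method according to claim 1, characterized in that training the hybrid neural network model comprises:
    obtaining multiple encrypted traffic files and annotating each encrypted traffic file with its classification label;
    preprocessing each encrypted traffic file, the preprocessing comprising: splitting each encrypted traffic file into multiple flows; then collecting multiple consecutive packets from each flow as a sample; and finally vectorizing and standardizing each sample to obtain a formatted set of sample vectors as training samples;
    training the hybrid neural network model, which comprises the 1D-CNN network, the stacked bidirectional LSTM network and the fully connected layer network, with the training samples to obtain the optimal network parameters;
    obtaining the trained hybrid neural network model.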
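  7. The network encrypted traffic identification method according to claim 6, characterized in that training the 1D-CNN network comprises:
    retaining the temporal dimension of the packets in each sample during vectorization; let $x^{<t>}$ be the t-th packet in a sample, where t is any integer from 0 to n and denotes any packet in the vector, and $x^{<t>}$ is an L-dimensional vector;
    $x = [x^{<1>}, x^{<2>}, \ldots, x^{<M>}]$   (5)
    x denotes a sample that contains the vectors of M packets; in the 1D-CNN, x is regarded as a two-dimensional array with M channels, each of which is an L-dimensional vector; let $x_{i:i+j}$ denote the bytes of all channels from any position i to position i+j; the one-dimensional convolution on x is as follows:
    $c_i^t = f(w^t \cdot x_{i:i+h-1} + b)$   (6)
    a convolutional layer usually contains multiple convolution kernels (filters); every filter performs the same operation and generates one channel of the new feature map; taking any kernel t as an example, $w^t$ is the window that slides over x, b is the bias and f is the nonlinear activation function; $c_i^t$ is the feature generated by kernel t;
    as the current filter slides over x, its convolution operation is applied to the bytes inside the window, and the sequence $\{x_{1:h}, x_{2:h+1}, \ldots, x_{n-h+1:n}\}$ produces the new feature map; all filters perform the same operation, but each filter has its own parameters w and b;
    $c^t = [c_1^t, c_2^t, \ldots, c_{n-h+1}^t]$   (7)
    here $c^t$ denotes the new feature map generated by kernel t, which can also be regarded as output channel t; for the new feature map of each channel, a pooling layer is additionally used to downsample the feature map; the pooling layer likewise slides a filter over its input, but the operation usually performed in each filter is $\max$, which keeps the maximum value in each sliding window.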
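  8. The network encrypted traffic identification method according to claim 7, characterized in that training the stacked bidirectional LSTM network comprises:
    in the LSTM, the channels of the input feature map are treated as time steps; at each time step the following formula holds:
    $\tilde{c}^{<t>} = \tanh(w_c[h^{<t-1>}, a^{<t>}] + b_c)$   (8)
    where $a^{<t>}$ is the vector of the input feature map at any time step t, whose dimension equals that of the feature map input at each time step; $h^{<t-1>}$ is the hidden-layer output of the previous time step, whose dimension is determined by the hidden dimension parameter of the LSTM unit; $\tilde{c}^{<t>}$ is the intermediate output of the current layer; $w_c$ and $b_c$ are the parameter matrix and the bias, respectively;
    in the LSTM, the final output is determined by three gates: the update gate $\Gamma_u$, the forget gate $\Gamma_f$ and the output gate $\Gamma_o$; the gate values are computed as follows:
    $\Gamma_u = \sigma(w_u[h^{<t-1>}, a^{<t>}] + b_u)$   (9)
    $\Gamma_f = \sigma(w_f[h^{<t-1>}, a^{<t>}] + b_f)$   (10)
    $\Gamma_o = \sigma(w_o[h^{<t-1>}, a^{<t>}] + b_o)$   (11)
    where $\sigma$ is a nonlinear activation function, and $w_u$, $w_f$, $w_o$ and $b_u$, $b_f$, $b_o$ are the parameter matrices and biases of the three gates, respectively;
    the values of the three gates are computed in a way similar to $\tilde{c}^{<t>}$; they are determined by the input $a^{<t>}$ of the current time step and the hidden-layer output $h^{<t-1>}$ of the previous time step; the update gate $\Gamma_u$, forget gate $\Gamma_f$ and output gate $\Gamma_o$ control whether the current LSTM unit updates the current information, forgets past information and outputs the final information; the three gates produce the final output by the following formulas, where $c^{<t>}$ is the intermediate output vector of the current layer:
    $c^{<t>} = \Gamma_u \odot \tilde{c}^{<t>} + \Gamma_f \odot c^{<t-1>}$   (12)
    $h^{<t>} = \Gamma_o \odot \tanh(c^{<t>})$   (13)
    the three gates select among the past information, the current information and the total output, respectively; the final result is the hidden-layer output $h^{<t>}$ of the current time step, where $\odot$ is the Hadamard product, that is, element-wise multiplication of vectors;
    the output of the stacked bidirectional LSTM concatenates the outputs of the two directions at the current time step:
    $h^{<t>} = [\overrightarrow{h}^{<t>}, \overleftarrow{h}^{<t>}]$   (14)
    at each time step, the hidden-layer output $h^{<t>}$ is the concatenation of the forward output $\overrightarrow{h}^{<t>}$ and the backward output $\overleftarrow{h}^{<t>}$ at that time step.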
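  9. The network encrypted traffic identification method according to claim 1, characterized in that computing the predicted probability distribution over the classes from the prediction vector comprises:
    computing the predicted probability distribution according to the following formula:
    $\hat{y}_i = \mathrm{softmax}(o)_i = \frac{e^{o_i}}{\sum_{j} e^{o_j}}$
    where o is the raw output vector of the hybrid neural network model, $\hat{y}$ is the predicted probability vector after softmax processing, and $o_i$ is the value at position i of vector o.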
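  10. A network encrypted traffic identification device, characterized by comprising an encrypted traffic acquisition module, a preprocessing module, a classification prediction module and a classification identification module, wherein:
    the encrypted traffic acquisition module is configured to obtain the encrypted traffic file to be identified;
    the preprocessing module is configured to preprocess the encrypted traffic to be identified, and comprises a flow splitting unit, a collection unit and a vectorization unit, wherein:
    the flow splitting unit is configured to split the encrypted traffic into multiple flows;
    the collection unit is configured to collect multiple consecutive packets from each flow as a sample;
    the vectorization unit is configured to vectorize and standardize each sample to obtain a formatted set of sample vectors;
    the classification prediction module is configured to input the preprocessed set of sample vectors into the pre-trained hybrid neural network model to obtain a prediction vector, each element of which is the predicted value that the encrypted traffic belongs to the corresponding class;
    the hybrid neural network model comprises a 1D-CNN network, a stacked bidirectional LSTM network and a fully connected layer network, wherein the 1D-CNN network performs spatial feature learning on the input set of sample vectors and outputs a low-dimensional feature map; the stacked bidirectional LSTM network performs temporal feature learning on the input feature map to obtain a feature map vector containing temporal features; and the fully connected layer determines the prediction vector from the input feature map vector containing temporal features;
    the classification identification module is configured to compute the predicted probability distribution over the classes from the prediction vector and to take the class with the largest probability as the final classification label of the encrypted traffic.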
PCT/CN2020/118725 2020-08-28 2020-09-29 Network encrypted traffic identification method and device WO2022041394A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010885293.1 2020-08-28
CN202010885293.1A CN112163594B (zh) 2020-08-28 2020-08-28 Network encrypted traffic identification method and device

Publications (1)

Publication Number Publication Date
WO2022041394A1 true WO2022041394A1 (zh) 2022-03-03

Family

ID=73859335

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118725 WO2022041394A1 (zh) 2020-08-28 2020-09-29 Network encrypted traffic identification method and device

Country Status (2)

Country Link
CN (1) CN112163594B (zh)
WO (1) WO2022041394A1 (zh)


Also Published As

Publication number Publication date
CN112163594B (zh) 2022-07-26
CN112163594A (zh) 2021-01-01


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20951062

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20951062

Country of ref document: EP

Kind code of ref document: A1