CN117633657A - Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement - Google Patents


Publication number
CN117633657A
Authority
CN
China
Prior art keywords
graph
session
data packet
flow
encrypted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311805721.5A
Other languages
Chinese (zh)
Inventor
王志宏
杨莹
朱彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Third Research Institute of the Ministry of Public Security
Original Assignee
Third Research Institute of the Ministry of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Third Research Institute of the Ministry of Public Security
Priority to CN202311805721.5A
Publication of CN117633657A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Data Exchanges In Wide-Area Networks

Abstract

The invention relates to a method for identifying encrypted application traffic based on multi-graph characterization enhancement, comprising the following steps: constructing a data packet graph based on multiple types of interaction information; constructing a session flow graph based on association relations between traffic sequences; and classifying encrypted application traffic with a hierarchical graph convolutional network. The invention also relates to a corresponding device, processor and computer readable storage medium. The method, device, processor and computer readable storage medium address two problems of existing deep-learning-based encrypted traffic classification algorithms: their feature construction is vulnerable to attack, and they ignore semantic associations between sessions. The method's innovation lies in building both the data packet graph and the session flow graph so that information within and between session flows is fully mined.

Description

Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement
Technical Field
The invention relates to graph neural network processing in deep learning, in particular to encrypted application traffic classification, and specifically to a method, device, processor and computer readable storage medium for identifying encrypted application traffic based on multi-graph characterization enhancement.
Background
Encrypted application traffic classification is an important problem in network security supervision. Encrypted communication effectively protects data in transit and blocks most intrusion and interception attacks, but it also makes network security supervision harder. Classifying and identifying encrypted application traffic is therefore one of the key technologies for strengthening network security supervision.
Existing algorithms for encrypted application traffic classification can be grouped by working principle and classification method into the following classes: (1) Rule-based algorithms: a rule set is built from expert experience or prior knowledge and used to judge characteristics such as the format and structure of transmitted messages, and traffic is classified accordingly. Such algorithms need no model training and are therefore fast, but their classification accuracy and applicable scenarios are limited and they depend on human experience. (2) Algorithms based on traditional machine learning: features are extracted from the encrypted traffic by statistical or machine learning methods, and the extracted features are then matched and classified. Such algorithms need a suitable feature representation to be built for each data set to improve classification accuracy, and they remain limited, for example because the extracted features are tied to a particular encryption algorithm. (3) Deep-learning-based algorithms: models such as convolutional neural networks (CNN) and recurrent neural networks (RNN) learn abstract, high-dimensional feature representations from raw data and classify the encrypted traffic. Such algorithms need large numbers of data samples and substantial computing resources, but classify better than the other methods.
All three kinds of method achieve good results in encrypted application traffic classification, with deep-learning-based algorithms performing best. However, existing deep-learning-based algorithms have the following problems: (1) They focus on the sequence features of a single session flow. The encrypted traffic sequence is converted into a grayscale image, and a model such as a CNN learns features from that image and performs the final classification. A grayscale image built this way is vulnerable to attack: adding a small perturbation (extra data packets) to the original traffic changes the grayscale image substantially, which deceives the model and causes classification errors. (2) They ignore semantic associations between session flows. Existing models concentrate on the data features of a single session flow and ignore the rich semantic relations among multiple encrypted sessions; that is, they perform no association analysis across related encrypted sessions, so the resulting session flow features are limited to a single session flow.
Disclosure of Invention
The invention aims to overcome the above defects of the prior art and to provide a method, device, processor and computer readable storage medium for identifying encrypted application traffic based on multi-graph characterization enhancement that offer high accuracy, simple operation and a wide range of application.
To achieve the above object, the method, device, processor and computer readable storage medium of the present invention are as follows:
The method for identifying encrypted application traffic based on multi-graph characterization enhancement is mainly characterized by comprising the following steps:
(1) Taking all session flow data of one encrypted application's traffic from the encrypted application traffic data set;
(2) Constructing a data packet graph;
(3) Introducing a graph convolutional network into the data packet graph and continuously updating the state information of the data packet nodes;
(4) Constructing the grayscale image representation of the original encrypted traffic by selecting a byte sequence of suitable length;
(5) Computing the convolutional network representation of the original encrypted traffic grayscale image;
(6) Constructing a session flow graph;
(7) Computing the session flow graph convolution representation by introducing a graph convolutional network into the session flow graph;
(8) Computing the encrypted application traffic classification;
(9) Predicting the encrypted application traffic class.
Preferably, the step (2) specifically includes the following steps:
(2.1) dividing the original traffic at session granularity, and extracting from each session flow the basic information used to construct the data packet graph;
(2.2) defining each data packet in a single session flow as a node in the data packet graph;
(2.3) defining the transmission direction of the first data packet as forward, so that a subsequent data packet travelling in the same direction as the first data packet takes a positive value and otherwise takes a negative value;
(2.4) treating consecutive data packets transmitted in the same direction as a cluster, dividing the edges of the data packet graph into intra-cluster edges and inter-cluster edges according to the time-sequence interaction and access interaction information of the data packets in the session flow, and using full connection between different clusters.
Preferably, the step (5) specifically includes the following steps:
(5.1) converting the handshake information byte stream into a grayscale image, and mapping the original bytes to fixed-length features using an embedding operation;
(5.2) processing the grayscale image with a one-dimensional convolution operation to acquire the context information associated with each byte, thereby obtaining richer semantic representation information.
Preferably, the step (6) specifically includes the following steps:
(6.1) processing the encrypted traffic at session granularity, dividing it into encrypted sessions by identical five-tuple, and deleting unencrypted session flows and incomplete session flows;
(6.2) taking each complete session flow in the encrypted traffic as a node of the session flow graph;
(6.3) defining the edges of the session flow graph based on the accessed network service and on packet sequence similarity.
Preferably, updating the state information of the data packet nodes in the step (3) specifically comprises:
updating the state information of the data packet nodes according to the following formula:
wherein the formula uses the adjacency matrix of the data packet graph and the degree matrix of that adjacency matrix, and the initial feature matrix is built from $V_{i,m}$, the embedded representation of a data packet based on its payload.
Preferably, computing the encrypted application traffic classification in the step (8) specifically comprises:
computing the encrypted application traffic classification according to the following formula:
wherein the formula gives the probability that encrypted session flow i belongs to class c, and C denotes the number of encrypted traffic classes.
The device for realizing encryption application flow identification processing based on multi-graph characterization enhancement is mainly characterized by comprising the following components:
a processor configured to execute computer-executable instructions;
and a memory storing one or more computer-executable instructions which, when executed by the processor, perform the steps of the method for implementing encrypted application traffic identification processing based on multi-graph feature enhancement described above.
The processor for realizing the encryption application flow identification processing based on the multi-graph feature enhancement is mainly characterized in that the processor is configured to execute computer executable instructions, and when the computer executable instructions are executed by the processor, the steps of the method for realizing the encryption application flow identification processing based on the multi-graph feature enhancement are realized.
The computer readable storage medium is characterized in that the computer program is stored thereon, and the computer program can be executed by a processor to implement the steps of the method for implementing the encryption application flow identification processing based on multi-graph feature enhancement.
The method, device, processor and computer readable storage medium of the invention address two problems of existing deep-learning-based encrypted traffic classification algorithms: their feature construction is vulnerable to attack, and they ignore semantic associations between sessions. First, a packet-level topology graph of each encrypted session is constructed from interaction features such as packet payload length, direction, packet sequence and cluster information, so as to fully mine the session flow information in encrypted application traffic. Second, to break through the limits of single-session-flow characterization, a session flow graph of encrypted application traffic is constructed from the association relations between traffic sequences, namely encrypted sessions that access the same network service or have similar packet sequences. Finally, a hierarchical graph convolutional network is introduced to perform representation learning on the data packet graph built from a single session and the session flow graph built from multiple sessions, which remedies the insufficient characterization of a single session flow and achieves high-accuracy identification and classification of encrypted traffic. The method's innovation lies in building both the data packet graph and the session flow graph so that information within and between session flows is fully mined.
Drawings
Fig. 1 is a schematic diagram of the basic structure of the method of the present invention for identifying encrypted application traffic based on multi-graph characterization enhancement.
Fig. 2 is a flow chart of an embodiment of the method of the present invention for identifying encrypted application traffic based on multi-graph characterization enhancement.
Detailed Description
To describe the technical content of the present invention more clearly, a further description is given below in connection with specific embodiments.
The method of the invention for identifying encrypted application traffic based on multi-graph characterization enhancement comprises the following steps:
(1) Taking all session flow data of one encrypted application's traffic from the encrypted application traffic data set;
(2) Constructing a data packet graph;
(3) Introducing a graph convolutional network into the data packet graph and continuously updating the state information of the data packet nodes;
(4) Constructing the grayscale image representation of the original encrypted traffic by selecting a byte sequence of suitable length;
(5) Computing the convolutional network representation of the original encrypted traffic grayscale image;
(6) Constructing a session flow graph;
(7) Computing the session flow graph convolution representation by introducing a graph convolutional network into the session flow graph;
(8) Computing the encrypted application traffic classification;
(9) Predicting the encrypted application traffic class.
As a preferred embodiment of the present invention, the step (2) specifically includes the following steps:
(2.1) dividing the original traffic at session granularity, and extracting from each session flow the basic information used to construct the data packet graph;
(2.2) defining each data packet in a single session flow as a node in the data packet graph;
(2.3) defining the transmission direction of the first data packet as forward, so that a subsequent data packet travelling in the same direction as the first data packet takes a positive value and otherwise takes a negative value;
(2.4) treating consecutive data packets transmitted in the same direction as a cluster, dividing the edges of the data packet graph into intra-cluster edges and inter-cluster edges according to the time-sequence interaction and access interaction information of the data packets in the session flow, and using full connection between different clusters.
As a preferred embodiment of the present invention, the step (5) specifically includes the following steps:
(5.1) converting the handshake information byte stream into a grayscale image, and mapping the original bytes to fixed-length features using an embedding operation;
(5.2) processing the grayscale image with a one-dimensional convolution operation to acquire the context information associated with each byte, thereby obtaining richer semantic representation information.
As a preferred embodiment of the present invention, the step (6) specifically includes the following steps:
(6.1) processing the encrypted traffic at session granularity, dividing it into encrypted sessions by identical five-tuple, and deleting unencrypted session flows and incomplete session flows;
(6.2) taking each complete session flow in the encrypted traffic as a node of the session flow graph;
(6.3) defining the edges of the session flow graph based on the accessed network service and on packet sequence similarity.
As a preferred embodiment of the present invention, updating the state information of the data packet nodes in the step (3) specifically comprises:
updating the state information of the data packet nodes according to the following formula:
wherein the formula uses the adjacency matrix of the data packet graph and the degree matrix of that adjacency matrix, and the initial feature matrix is built from $V_{i,m}$, the embedded representation of a data packet based on its payload.
As a preferred embodiment of the present invention, computing the encrypted application traffic classification in the step (8) specifically comprises:
computing the encrypted application traffic classification according to the following formula:
wherein the formula gives the probability that encrypted session flow i belongs to class c, and C denotes the number of encrypted traffic classes.
The device for realizing encryption application flow identification processing based on multi-graph characterization enhancement is mainly characterized by comprising the following components:
a processor configured to execute computer-executable instructions;
and a memory storing one or more computer-executable instructions which, when executed by the processor, perform the steps of the method for implementing encrypted application traffic identification processing based on multi-graph feature enhancement described above.
The processor for implementing the encrypted application traffic identification process based on multi-graph feature enhancement of the present invention is mainly characterized in that the processor is configured to execute computer executable instructions, which when executed by the processor, implement the steps of the method for implementing the encrypted application traffic identification process based on multi-graph feature enhancement described above.
The computer readable storage medium of the present invention is mainly characterized in that it has a computer program stored thereon, said computer program being executable by a processor to implement the steps of the method for implementing encryption application traffic identification processing based on multi-graph feature enhancement as described above.
In the specific embodiment of the invention, encrypted application traffic identification and classification is approached through multi-dimensional characterization of encrypted traffic: (1) a data packet graph construction method based on multiple types of interaction information is provided, to address the vulnerability to attack of the traditional grayscale image representation of a single session; (2) a session flow graph construction method based on association relations between traffic sequences is provided, which breaks through the limitation of single-session-flow characterization and enriches the semantic representation of each encrypted session through the associations among multiple encrypted sessions; (3) graph convolutional neural network technology is introduced, and a multi-level graph neural network performs representation learning on the data packet graph built from a single session and the session flow graph built from multiple sessions, to achieve high-accuracy identification and classification of encrypted application traffic.
The invention adopts the following technical scheme. For data packet graph construction, a method based on multiple types of interaction information is provided, which improves the packet-based characterization of a single session flow. For session flow graph construction, a method based on association relations between traffic sequences is provided, which addresses the absence of session association information in encrypted traffic representations and so improves session flow characterization based on encrypted session context. A hierarchical graph convolutional network structure is then provided, and a model for rapid identification and classification of encrypted application traffic is built at the data packet level and the session flow level. The steps are as follows:
Step one: constructing a data packet graph based on multiple types of interaction information. The invention takes the data packets in a single session flow as the main object and considers the differences in their interaction features (such as packet payload, packet flow direction and packet sequence) to construct a data packet interaction topology graph (Package Graph). This mainly comprises the following steps: (1) Encrypted traffic preprocessing. The original traffic is divided at session granularity, and basic information such as the packet five-tuple, packet payload and packet flow direction is extracted from each session flow to construct the data packet graph. (2) Data packet graph node construction. Each data packet in a single session flow is defined as a node of the data packet graph, with the packet payload and the packet flow direction as the node's initial values. (3) Data packet graph edge construction. Consecutive data packets transmitted in the same direction are called a cluster, and the edges of the data packet graph are divided into intra-cluster edges and inter-cluster edges according to the time-sequence interaction and access interaction information of the data packets in the session flow. The invention uses full connection both within clusters and between different clusters to obtain richer node relation information.
Step two: constructing a session flow graph based on association relations between traffic sequences. The invention constructs a session flow graph (Record Graph) from these association relations to obtain richer semantic information across multiple encrypted sessions. This mainly comprises the following steps: (1) Encrypted traffic preprocessing. The encrypted traffic is processed at session granularity, and only encrypted session flows with a complete session are retained. (2) Session flow graph node construction. Each complete session flow in the encrypted traffic is taken as a node of the session flow graph, with the data packet graph representation and the grayscale image representation of the original encrypted traffic as the node's initial values. (3) Session flow graph edge construction. When two session flows have the same destination IP address and destination port number, an edge is established between them (access network service association). When the similarity of two session flows exceeds a threshold, they are more likely to carry the same type of application, and an edge is established between them (packet sequence similarity).
Step three: classifying encrypted application traffic with a hierarchical graph convolutional network. The invention uses a graph convolutional network (Graph Convolutional Network, GCN) as the underlying network for graph feature extraction. First, a graph convolutional network is introduced into the data packet graph; by continuously updating the state information of the data packet nodes, the states of neighbor nodes are aggregated and the representation of a single session flow is enriched. Second, to capture representations at different granularities, such as the data packet level and the session flow level, the data packet feature representation and the original encrypted traffic grayscale image representation of each session flow are used as the initial representation of the session flow graph nodes, and a further graph convolutional network is applied to obtain a richer and more robust feature representation. Finally, after the multi-layer graph convolutional representation layers, a linear function applies a linear transformation to the output, a Softmax layer predicts the distribution over application classes for the encrypted traffic, and the application class to which the traffic belongs is determined from that probability distribution.
Referring to Fig. 2, the encrypted application traffic identification method of the present invention includes the following steps:
1. Data preparation. All session flow data of one encrypted application's traffic is taken from the encrypted application traffic data set.
2. Constructing the data packet graph. 1) The original traffic is divided at session granularity, and the basic information used to construct the data packet graph is extracted from each session flow, including the packet five-tuple (transport protocol, source port number, source IP address, destination port number, destination IP address), the packet payload and the packet flow direction. 2) Each data packet in a single session flow is defined as a node in the data packet graph: $V_{i,j}$ ($i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$) denotes the j-th data packet in the i-th session flow, where n is the number of session flows in a given segment of encrypted traffic and m is the number of data packets in a single session flow. 3) The packet flow direction is represented by the sign of the packet payload length: the transmission direction of the first data packet is defined as forward, e.g. $(V_{1,1}, 10)$; a subsequent data packet in the same direction as the first takes a positive value, otherwise a negative value. 4) Consecutive data packets transmitted in the same direction are called a cluster; the edges of the data packet graph are divided into intra-cluster edges and inter-cluster edges according to the time-sequence interaction and access interaction information of the data packets in the session flow, and full connection is used between different clusters.
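The packet graph construction described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `build_packet_graph`, the `(direction, payload_length)` input format, and the choice to fully connect only neighbouring clusters are all hypothetical, since the text does not fix these details.

```python
from itertools import combinations


def build_packet_graph(packets):
    """Build a packet graph for one session flow.

    packets: list of (direction, payload_length) in arrival order, where
    direction is any hashable label (hypothetical input format).
    Returns (signed_lengths, clusters, intra_edges, inter_edges).
    """
    if not packets:
        return [], [], set(), set()

    # The first packet's direction is defined as forward (positive sign).
    forward = packets[0][0]
    is_forward = [direction == forward for direction, _ in packets]
    signed = [n if f else -n for f, (_, n) in zip(is_forward, packets)]

    # Consecutive packets in the same direction form one cluster.
    clusters = []
    for idx, f in enumerate(is_forward):
        if clusters and is_forward[clusters[-1][-1]] == f:
            clusters[-1].append(idx)
        else:
            clusters.append([idx])

    # Intra-cluster edges: full connection inside each cluster.
    intra = set()
    for cluster in clusters:
        intra.update(combinations(cluster, 2))

    # Inter-cluster edges: full connection between neighbouring clusters
    # (assumption: only adjacent clusters are linked).
    inter = set()
    for left, right in zip(clusters, clusters[1:]):
        inter.update((u, v) for u in left for v in right)

    return signed, clusters, intra, inter
```

Tracking direction separately from the signed length keeps zero-length payloads in the correct cluster, which a sign test alone would not.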
3. Graph convolutional network representation of the data packet graph. A graph convolutional network is introduced into the data packet graph, and the state information of the data packet nodes is continuously updated according to formula (1).
Wherein the formula uses the adjacency matrix of the data packet graph, the degree matrix of that adjacency matrix, and the output of the previous convolution layer; the initial feature matrix is built from $V_{i,m}$, the embedded representation of a data packet based on its payload.
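As an illustration of one node-state update, the sketch below applies the widely used graph convolution propagation rule with NumPy. It assumes, without confirmation from the text, that formula (1) has this standard form: self-loops added to the adjacency matrix, symmetric degree normalization, and a ReLU activation. The name `gcn_layer` and the weight matrix `w` are hypothetical.

```python
import numpy as np


def gcn_layer(adj, h, w):
    """One graph convolution step (assumed standard form).

    adj: (N, N) adjacency matrix of the graph
    h:   (N, F_in) node states from the previous layer
    w:   (F_in, F_out) trainable weights (hypothetical parameter)
    """
    # Add self-loops so each node keeps its own state (assumption).
    a_tilde = adj + np.eye(adj.shape[0])
    # Degree of each node in the self-looped graph.
    deg = a_tilde.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: D^{-1/2} A D^{-1/2}.
    a_norm = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Aggregate neighbour states, transform, then apply ReLU.
    return np.maximum(a_norm @ h @ w, 0.0)
```

Stacking several such layers lets each packet node aggregate state from progressively more distant neighbours in the packet graph.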
4. Grayscale image representation of the original encrypted traffic. A byte sequence of suitable length (the first B bytes) is selected to build the representation, such that the ClientHello, ServerHello, Certificate and similar messages are contained in the first B bytes. The handshake information in session i is represented as follows:
$$\mathrm{RawBytes}(i) = (b_{i,1}, b_{i,2}, \ldots, b_{i,b}, \ldots, b_{i,B}) \quad (2)$$
where $b_{i,b}$ denotes the b-th byte of the handshake information in the i-th session flow, and $b_{i,b} \in [0, 255]$.
5. Convolutional network representation of the original encrypted traffic grayscale image. The handshake information byte stream is converted into a grayscale image, the original bytes are mapped to fixed-length features by an embedding operation, and the grayscale image is then processed with a one-dimensional convolution operation to acquire the context information associated with each byte, thereby obtaining richer semantic representation information. The grayscale image of the i-th session flow is represented as:
$$\mathrm{RawH}_i = \mathrm{Conv1D}(\mathrm{embedding}(\mathrm{RawBytes}(i))) \quad (3)$$
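Equations (2) and (3) can be illustrated with a small NumPy sketch. The helper names `raw_bytes`, `embed` and `conv1d`, the zero-padding of short payloads, and the unpadded ("valid") convolution are assumptions made for the example; the text does not specify padding, kernel size or embedding dimension.

```python
import numpy as np


def raw_bytes(payload: bytes, B: int) -> np.ndarray:
    """Equation (2): the first B bytes as integers in [0, 255].

    Short payloads are zero-padded to length B (assumption).
    """
    arr = np.frombuffer(payload[:B], dtype=np.uint8).astype(np.int64)
    return np.pad(arr, (0, B - arr.size))


def embed(byte_ids: np.ndarray, table: np.ndarray) -> np.ndarray:
    """Map each byte value to a fixed-length vector via a (256, d) table."""
    return table[byte_ids]


def conv1d(x: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """One-dimensional convolution along the byte axis, without padding.

    x: (B, d_in) embedded bytes; kernel: (k, d_in, d_out).
    Each output row mixes k neighbouring bytes, giving byte context.
    """
    k = kernel.shape[0]
    steps = x.shape[0] - k + 1
    return np.stack(
        [np.einsum("kd,kdo->o", x[t:t + k], kernel) for t in range(steps)]
    )
```

A usage pattern mirroring equation (3) would be `conv1d(embed(raw_bytes(payload, B), table), kernel)`, where `table` and `kernel` stand in for learned parameters.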
6. Constructing the session flow graph. 1) The basic unit of session flow graph construction is the session flow, so the encrypted traffic is processed at session granularity, including splitting and filtering. The original encrypted traffic is split into separate session flow units, that is, encrypted sessions are divided by identical five-tuple, where the source IP and port may be interchanged with the destination IP and port. Unencrypted session flows and incomplete session flows are deleted at the same time, which reduces unnecessary computation later. 2) Each complete session flow in the encrypted traffic is taken as a node of the session flow graph: $R_i$ ($i = 1, 2, \ldots, n$) denotes the i-th session node, and n denotes the total number of sessions in a given segment of encrypted traffic. 3) The edges of the session flow graph are defined by the association relations between session flow sequences, which comprise two parts: access network service association and packet sequence similarity. Access network service association refers to whether two session flows share the same destination IP address and destination port number. In the corresponding formula, a value of 1 denotes that a connection is established between the two session flows.
Packet sequence similarity association means that a connection is established between two session flows when their similarity exceeds a threshold. The greater the similarity, the more likely the two session flows carry the same type of application. The similarity of two session flows is computed from the Euclidean distance between them.
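The session splitting and edge rules above can be sketched as follows. This is an illustrative sketch only: the function names, the dictionary layout of a flow, the zero-padding of unequal-length sequences, and the conversion `1 / (1 + distance)` from Euclidean distance to a similarity score are assumptions, because the text does not give the exact formulas.

```python
import math


def session_key(proto, src_ip, src_port, dst_ip, dst_port):
    """Direction-independent five-tuple key: swapping source and
    destination endpoints yields the same session."""
    endpoints = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (proto, endpoints[0], endpoints[1])


def euclidean(seq_a, seq_b):
    """Euclidean distance between two packet-length sequences,
    zero-padding the shorter one (assumption)."""
    n = max(len(seq_a), len(seq_b))
    a = list(seq_a) + [0] * (n - len(seq_a))
    b = list(seq_b) + [0] * (n - len(seq_b))
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity(seq_a, seq_b):
    """Map distance to a (0, 1] score; this mapping is an assumption."""
    return 1.0 / (1.0 + euclidean(seq_a, seq_b))


def build_session_edges(flows, threshold):
    """flows: list of dicts with 'dst' = (ip, port) and 'seq' = signed
    packet lengths (hypothetical layout). Returns a set of (i, j) edges."""
    edges = set()
    for i in range(len(flows)):
        for j in range(i + 1, len(flows)):
            # Access network service association: same destination.
            same_service = flows[i]["dst"] == flows[j]["dst"]
            # Packet sequence similarity above the threshold.
            similar = similarity(flows[i]["seq"], flows[j]["seq"]) > threshold
            if same_service or similar:
                edges.add((i, j))
    return edges
```

Comparing every pair of flows is quadratic in the number of sessions, which is acceptable for a sketch but would likely need indexing by destination for large captures.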
7. Session flow graph convolution representation. A graph convolutional network is also introduced into the session flow graph. The difference is that the initial representation of each session flow node combines the data packet feature representation with the original encrypted-traffic grayscale-map representation, so that session flow nodes capture information at different granularities, such as the data packet level and the session flow level, making the representation richer and more robust. The specific formula is as follows:
wherein,
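As a rough illustration of step 7, the sketch below applies one graph-convolution step to a small session flow graph. It assumes the widely used propagation rule ReLU(D^{-1/2} (A + I) D^{-1/2} H W), which may differ from the formula of this embodiment. The node features concatenate a hypothetical per-session data-packet-graph vector with a hypothetical pooled grayscale-map vector, and all numeric values and the weight matrix are made up for the example.

```python
import math


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, with D the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    h = matmul(matmul(normalize_adjacency(adj), features), weight)
    return [[max(v, 0.0) for v in row] for row in h]


# Hypothetical per-session inputs (one row per session flow node).
packet_repr = [[0.2, 0.1], [0.4, 0.3], [0.9, 0.5]]  # from the data packet graph
gray_repr = [[1.0], [0.8], [0.1]]                   # pooled grayscale-map features
features = [p + g for p, g in zip(packet_repr, gray_repr)]  # concatenation

session_adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]     # sessions 0 and 1 are connected
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]      # assumed 3x2 weight matrix
out = gcn_layer(session_adj, features, weight)
print(out)
```

Sessions 0 and 1 are neighbours with equal degree, so after one step they receive the same averaged representation, while the isolated session 2 keeps only its own transformed features.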
8. Encrypted application traffic classification calculation. The class probability distribution of the encrypted application traffic is predicted using Softmax, as shown in equation (8) below.
wherein the predicted value represents the probability that encrypted session flow i belongs to class c, and C represents the number of classes of encrypted traffic.
9. Encrypted application traffic classification prediction. Suppose the classification probabilities calculated for a selected encrypted application flow are {Twitter: 0.93; Telegram: 0.02; Facebook: 0.04; YouTube: 0.01}; the final classification result of the model is then Twitter.
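Steps 8 and 9 can be illustrated with a short sketch: Softmax turns per-class scores into a probability distribution, and the predicted class is the one with the highest probability. The function names and the example scores are hypothetical; the probability values are those of the example above.

```python
import math


def softmax(scores):
    """Numerically stable Softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(probabilities):
    """Return the class label with the highest probability."""
    return max(probabilities, key=probabilities.get)


# Probabilities from the example in step 9.
probs = {"Twitter": 0.93, "Telegram": 0.02, "Facebook": 0.04, "YouTube": 0.01}
print(predict(probs))  # Twitter

# Softmax on made-up scores always yields a distribution that sums to 1.
dist = softmax([4.0, 0.2, 0.9, -0.5])
print(round(sum(dist), 6))  # 1.0
```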
For the specific implementation of this embodiment, refer to the related description in the foregoing embodiment; it is not repeated here.
It is to be understood that the same or similar parts in the above embodiments may be referred to each other, and that for content not described in detail in some embodiments, reference may be made to the same or similar parts in other embodiments.
It should be noted that in the description of the present invention, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present invention, unless otherwise indicated, "plurality" means at least two.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and further implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution device. For example, if implemented in hardware, as in another embodiment, they may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gates, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or part of the steps carried out in the method of the above embodiments may be implemented by a program to instruct related hardware, and the corresponding program may be stored in a computer readable storage medium, where the program when executed includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented as software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The method, device, processor and computer readable storage medium for realizing encrypted application traffic identification processing based on multi-graph characterization enhancement address two problems of existing deep-learning-based encrypted traffic classification algorithms: their feature construction is vulnerable to attack, and they ignore semantic associations between sessions. First, a packet-level encrypted session topology graph is constructed from interaction features such as data packet payload length, direction, packet order and cluster information, so as to fully mine the information within each session flow of the encrypted application traffic. Next, going beyond the representational limits of a single session flow, an encrypted application session flow graph is constructed from flow-sequence associations, namely encrypted sessions that access the same network service and encrypted sessions whose data packet sequences are similar. Finally, a hierarchical graph convolutional network is introduced to perform representation learning on both the data packet graph built from a single session and the session flow graph built from multiple sessions, which addresses problems such as the insufficient representation of a single session flow and enables high-precision identification and classification of encrypted traffic. By constructing both the data packet graph and the session flow graph, the method fully mines the information within and between session flows and is innovative in this respect.
In this specification, the invention has been described with reference to specific embodiments thereof. It will be apparent, however, that various modifications and changes may be made without departing from the spirit and scope of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims (9)

1. A method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement, the method comprising the steps of:
(1) Extracting all session flow data of an encrypted application flow from the encrypted application traffic data set;
(2) Constructing a data packet graph;
(3) Introducing a graph convolutional network into the data packet graph, and continuously updating the state information of the data packet nodes;
(4) Constructing the original encrypted-traffic grayscale-map representation, selecting bytes of a suitable length for constructing the representation;
(5) Performing convolutional network representation of the original encrypted-traffic grayscale map;
(6) Constructing a session flow graph;
(7) Performing session flow graph convolution representation by introducing a graph convolutional network into the session flow graph;
(8) Calculating the encrypted application traffic classification;
(9) Performing encrypted application traffic classification prediction.
2. The method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement according to claim 1, wherein the step (2) specifically comprises the steps of:
(2.1) dividing the original traffic at session granularity, and extracting from the session flow the basic information for constructing the data packet graph;
(2.2) defining each data packet in a single session flow as a node in the data packet graph;
(2.3) defining the transmission direction of the first data packet as the forward direction, wherein a subsequent data packet takes a positive value if its direction is the same as that of the first data packet, and a negative value otherwise;
(2.4) treating consecutive data packets transmitted in the same direction as a cluster, and dividing the edges of the data packet graph into intra-cluster and inter-cluster edges according to the time-sequence interaction and access interaction information of the data packets in the session flow, wherein a full-connection mode is adopted between different clusters.
3. The method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement according to claim 1, wherein the step (5) specifically comprises the steps of:
(5.1) converting the handshake information byte stream into a grayscale map, and mapping the original bytes to fixed-length features using an embedding operation;
(5.2) processing the grayscale map using a one-dimensional convolution operation, and acquiring the context information of each byte to obtain richer semantic representation information.
4. The method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement according to claim 1, wherein the step (6) specifically comprises the steps of:
(6.1) processing the encrypted traffic at session granularity, dividing the encrypted sessions according to the same five-tuple, and deleting unencrypted session flows and incomplete session flows;
(6.2) taking each complete session flow in the encrypted traffic as a node of the session flow graph;
(6.3) defining the edges in the session flow graph based on the accessed network service and the packet sequence similarity.
5. The method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement according to claim 1, wherein the updating of the data packet node state information in the step (3) specifically comprises:
updating the state information of the data packet nodes according to the following formula:
wherein the adjacency matrix is that of the data packet graph, D is the degree matrix of that adjacency matrix, and the initial feature matrix is set to V_{i,m}, an embedded representation of the data packet based on the packet payload.
6. The method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement according to claim 1, wherein the calculating of the encrypted application traffic classification in the step (8) specifically comprises:
calculating the encrypted application traffic classification according to the following formula:
wherein the predicted value represents the probability that encrypted session flow i belongs to class c, and C represents the number of classes of encrypted traffic.
7. An apparatus for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement, the apparatus comprising:
a processor configured to execute computer-executable instructions;
a memory storing one or more computer-executable instructions which, when executed by the processor, perform the steps of the method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement of any one of claims 1 to 6.
8. A processor for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement, wherein the processor is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement of any one of claims 1 to 6.
9. A computer readable storage medium having stored thereon a computer program executable by a processor to perform the steps of the method for implementing encrypted application traffic identification processing based on multi-graph characterization enhancement as claimed in any one of claims 1 to 6.
CN202311805721.5A 2023-12-26 2023-12-26 Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement Pending CN117633657A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311805721.5A CN117633657A (en) 2023-12-26 2023-12-26 Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311805721.5A CN117633657A (en) 2023-12-26 2023-12-26 Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement

Publications (1)

Publication Number Publication Date
CN117633657A true CN117633657A (en) 2024-03-01

Family

ID=90032279

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311805721.5A Pending CN117633657A (en) 2023-12-26 2023-12-26 Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement

Country Status (1)

Country Link
CN (1) CN117633657A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118101357A (en) * 2024-04-29 2024-05-28 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Network flow classification method combining data packet semantics

Similar Documents

Publication Publication Date Title
WO2022041394A1 (en) Method and apparatus for identifying network encrypted traffic
CN108900432B (en) Content perception method based on network flow behavior
Zeng et al. DeepVCM: A deep learning based intrusion detection method in VANET
WO2018054342A1 (en) Method and system for classifying network data stream
Wei et al. ABL-TC: A lightweight design for network traffic classification empowered by deep learning
CN111464485A (en) Encrypted proxy flow detection method and device
Soleymanpour et al. CSCNN: cost-sensitive convolutional neural network for encrypted traffic classification
CN117633657A (en) Method, device, processor and computer readable storage medium for realizing encryption application flow identification processing based on multi-graph characterization enhancement
Divakaran et al. Slic: Self-learning intelligent classifier for network traffic
Cheng et al. Real-time encrypted traffic classification via lightweight neural networks
US12014277B2 (en) Physical layer authentication of electronic communication networks
CN113452676B (en) Detector distribution method and Internet of things detection system
CN111431819A (en) Network traffic classification method and device based on serialized protocol flow characteristics
CN112491894A (en) Internet of things network attack flow monitoring system based on space-time feature learning
CN113472751B (en) Encrypted flow identification method and device based on data packet header
CN111565156A (en) Method for identifying and classifying network traffic
Lin et al. A novel multimodal deep learning framework for encrypted traffic classification
CN112468324A (en) Graph convolution neural network-based encrypted traffic classification method and device
Tan et al. Recognizing the content types of network traffic based on a hybrid DNN-HMM model
CN114826776A (en) Weak supervision detection method and system for encrypted malicious traffic
CN114650229A (en) Network encryption traffic classification method and system based on three-layer model SFTF-L
Wang et al. A two-phase approach to fast and accurate classification of encrypted traffic
Huo et al. A novel approach for semi-supervised network traffic classification
Zeng et al. TEST: An end-to-end network traffic examination and identification framework based on spatio-temporal features extraction
CN114979017B (en) Deep learning protocol identification method and system based on original flow of industrial control system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination