CN110708215B - Deep packet inspection rule base generation method, device, network equipment and storage medium - Google Patents


Info

Publication number
CN110708215B
Authority
CN
China
Prior art keywords
rule
data packet
rule base
data
generated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910957075.1A
Other languages
Chinese (zh)
Other versions
CN110708215A (en
Inventor
石仟华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Onething Technology Co Ltd
Original Assignee
Shenzhen Onething Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Onething Technology Co Ltd filed Critical Shenzhen Onething Technology Co Ltd
Priority to CN201910957075.1A priority Critical patent/CN110708215B/en
Publication of CN110708215A publication Critical patent/CN110708215A/en
Application granted granted Critical
Publication of CN110708215B publication Critical patent/CN110708215B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/028 Capturing of monitoring data by filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/02 Capturing of monitoring data
    • H04L43/026 Capturing of monitoring data using flow identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention provides a deep packet inspection rule base generation method, which comprises the following steps: receiving a data packet and identifying the data stream to which the data packet belongs; judging whether the data packet matches a rule in a rule base; extracting a feature code of the data packet when the data packet does not match any rule in the rule base; generating a rule according to the feature code; judging whether the generated rule is valid according to the generated rule and other data packets in the data stream; and updating the rule base when the generated rule is determined to be valid. The invention also provides a deep packet inspection rule base generation device, network equipment and a storage medium. The method and the device can automatically generate rules for the data packets of a new application and update the existing rule base in real time, thereby improving the application recognition rate.

Description

Deep packet inspection rule base generation method, device, network equipment and storage medium
Technical Field
The present invention relates to the field of data networks, and in particular, to a method, an apparatus, a network device, and a storage medium for generating a deep packet inspection rule base.
Background
Deep Packet Inspection (DPI) is a high-speed inspection method for network data that mainly inspects the content of the payload field of a network packet. The technology is widely used in Intrusion Prevention Systems (IPS) and Intrusion Detection Systems (IDS).
At present, deep packet inspection relies on manually identifying an application, extracting its feature codes and compiling them into a deep packet inspection rule base. However, as the number of applications grows and new applications can appear at any time, an existing deep packet inspection rule base cannot accurately identify new applications.
Therefore, there is a need to provide a deep packet inspection rule base generation scheme that can update DPI rule features in time according to new applications.
Disclosure of Invention
The invention mainly aims to provide a deep packet inspection rule base generation method, a deep packet inspection rule base generation device, network equipment and a storage medium, and aims to solve the technical problem that an existing rule base cannot identify new applications in time.
To achieve the above object, a first aspect of the present invention provides a deep packet inspection rule base generating method, the method including:
receiving a data packet and identifying a data stream to which the data packet belongs;
judging whether the data packet is matched with rules in a rule base or not;
extracting a feature code of the data packet when the data packet is not matched with the rules in the rule base;
generating a rule according to the feature code;
judging whether the generated rule is valid or not according to the generated rule and other data packets in the data stream;
when the generated rule is determined to be valid, the rule base is updated.
According to an optional embodiment of the invention, the determining whether the generated rule is valid according to the generated rule and other data packets in the data stream comprises:
in the next round of DPI detection, matching other data packets in the data stream by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is greater than or equal to a preset matching rate threshold value;
when the matching passing rate is greater than or equal to the preset matching rate threshold value, determining that the generated rule is valid;
and when the matching passing rate is smaller than the preset matching rate threshold value, determining that the generated rule is invalid.
According to an alternative embodiment of the present invention, the updating the rule base when it is determined that the generated rule is valid includes:
reporting the valid rule and the corresponding matching passing rate to a server;
receiving a final rule base generated by the server according to all the reported matching passing rates;
and replacing the rule base with the final rule base.
According to an alternative embodiment of the present invention, extracting the feature code of the data packet includes:
extracting header information of the data packet in each layer of protocol;
acquiring a plurality of pieces of target information from each piece of header information;
and connecting the plurality of pieces of target information to obtain the feature code.
According to an optional embodiment of the invention, the extracting header information of the data packet in each layer of protocol includes: extracting the payload of the data packet in an application layer protocol; and the acquiring a plurality of pieces of target information from each piece of header information includes: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
According to an optional embodiment of the invention, the generating a rule according to the feature code comprises:
performing regular-expression matching on all characters in the feature code according to the basic rules of a set regular expression to obtain the rule.
According to an alternative embodiment of the present invention, when it is determined that the data packet matches a rule in the rule base, the method further comprises:
determining the application of the data packet according to the matched rule;
and distributing the data packet to a corresponding link according to the application of the data packet.
To achieve the above object, a second aspect of the present invention provides a deep packet inspection rule base generating apparatus, the apparatus comprising:
the receiving module is used for receiving the data packet and identifying the data flow to which the data packet belongs;
the matching module is used for judging whether the data packet is matched with the rule in the rule base or not;
the extraction module is used for extracting the feature codes of the data packet when the data packet is not matched with the rules in the rule base;
the generation module is used for generating rules according to the feature codes;
the judging module is used for judging whether the generated rule is valid or not according to the generated rule and other data packets in the data stream;
and the updating module is used for updating the rule base when the generated rule is determined to be valid.
In order to achieve the above object, a third aspect of the present invention provides a network device, including a memory and a processor, where the memory stores a download program of a deep packet inspection rule base generation method that can be executed on the processor, and the download program of the deep packet inspection rule base generation method implements the deep packet inspection rule base generation method when executed by the processor.
To achieve the above object, a fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a download program of a deep packet inspection rule base generation method executable by one or more processors to implement the deep packet inspection rule base generation method.
According to the deep packet inspection rule base generation method, device, network equipment and storage medium, for each new application a rule is automatically generated from the application's data packets and the existing rule base is updated in real time, so that the identification rate of application traffic can be improved. Rule generation also does not lag behind the use of the application. When there are many network nodes, many rules are generated automatically; the server performs calculations on the generated rules to further determine their validity, which reduces the cost of manual identification.
Drawings
FIG. 1 is a flowchart of a deep packet inspection rule base generation method according to a first embodiment of the present invention;
FIG. 2 is a schematic functional block diagram of a deep packet inspection rule base generating device according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of an internal structure of a network device according to a third embodiment of the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms first and second in the description and claims of the application and in the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments described herein may be implemented in other sequences than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that descriptions such as "first" and "second" in this disclosure are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and falls outside the scope of protection claimed by the present invention.
Example 1
Referring to fig. 1, a flowchart of a deep packet inspection rule base generating method according to a first embodiment of the present invention is shown.
The deep packet inspection rule base generation method can be applied to network equipment, which may include switches, routers, firewall devices or other network security devices. The method specifically comprises the following steps; the order of the steps in the flowchart may be changed according to different requirements, and certain steps may be omitted.
S11, receiving the data packet and identifying the data stream to which the data packet belongs.
A packet is the transmission unit of the network layer in the TCP/IP communication protocol suite, and is also its smallest unit. Packets having the same quadruple (e.g., source address, source port, destination address, destination port) form one data stream; that is, one data stream contains a plurality of data packets. For example, a TCP data stream may contain 3 client packets and 40 server packets.
In an application scenario, a terminal may send a data packet to a server, and when receiving the data packet, a network device may match the data packet with a rule, and release the data packet to the server after the matching is successful.
After receiving an application's data packet, the network device may identify the data stream to which the data packet belongs by acquiring relevant information in the data packet, such as the source IP address, destination IP address, host name, IP protocol type (TCP/UDP/ICMP), and source port number and/or destination port number range. Identifying the data stream to which a data packet belongs is prior art and is not described in detail here. For example, in the Linux kernel, the data stream can be identified using the kernel's netfilter connection tracking functionality.
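As an illustration of grouping data packets into data streams by their quadruple, a minimal Python sketch follows. The `Packet` class, its field names, `flow_key` and `group_by_flow` are hypothetical names introduced here and are not part of the invention. Making the key direction-independent, so that client and server packets of one connection fall into the same data stream, is one possible design choice and an assumption of this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    # Hypothetical minimal packet representation, for illustration only.
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    payload: bytes = b""


def flow_key(pkt: Packet) -> tuple:
    """Return a direction-independent key built from the quadruple."""
    a = (pkt.src_ip, pkt.src_port)
    b = (pkt.dst_ip, pkt.dst_port)
    # Order the two endpoints so that both directions map to the same key.
    return (a, b) if a <= b else (b, a)


def group_by_flow(packets):
    """Group packets into data streams keyed by flow_key."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return dict(flows)


packets = [
    Packet("10.0.0.2", 50000, "93.184.216.34", 80, b"GET / HTTP/1.1"),
    Packet("93.184.216.34", 80, "10.0.0.2", 50000, b"HTTP/1.1 200 OK"),
    Packet("10.0.0.3", 50001, "93.184.216.34", 80, b"GET /a HTTP/1.1"),
]
flows = group_by_flow(packets)
```

Here the first two packets belong to the same data stream (request and response), while the third packet starts a second data stream.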
S12, judging whether the data packet is matched with the rule in the rule base.
A rule base is usually stored in the network device in advance and is used to perform matching detection on received data packets, thereby implementing the DPI function. The rule base may be an Intrusion Prevention System (IPS) rule base, a Uniform Resource Locator (URL) classification rule base, or the like. The rule base currently stored by the network device is referred to as the current rule base. The current rule base is not a database but a set of rules, i.e. descriptions of conditions, used to match the data packets of different applications. The current rule base comprises at least one rule and a rule tree constructed from the at least one rule, where each rule is a character string.
The network device performs security control according to the result of matching the data packet against the rules in the rule base. When the data packet matches a rule in the rule base, the data packet is released to the server. When the data packet does not match any rule in the rule base, S13 may be executed, or the data packet may be discarded.
S13, extracting the feature codes of the data packet when the data packet is not matched with the rules in the rule base.
When the rule base contains a plurality of rules, the data packet needs to be matched against each of them. If the data packet matches any one rule, it is considered to have matched the rule base; if it fails to match every rule, it is considered to have failed to match the rule base.
The matching process of the data packet and the rule is the prior art, and the present invention is not described in detail herein.
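The "match any rule, otherwise fail" behaviour described above can be sketched as follows. This is a simplified illustration, assuming each rule is a regular-expression string mapped to an application name; the rule strings, the `rule_base` dictionary and `match_rule_base` are hypothetical, and the rule tree mentioned in the description is replaced here by a plain linear scan.

```python
import re

# Hypothetical rule base: each rule is a string (here a regular expression)
# mapped to the application it identifies.
rule_base = {
    r"^GET\s+\S+\s+HTTP/1\.[01]": "http",
    r"^SSH-2\.0-": "ssh",
}


def match_rule_base(payload: bytes, rules: dict):
    """Return the application of the first matching rule, or None."""
    # latin-1 maps every byte to exactly one character, so decoding never fails.
    text = payload.decode("latin-1")
    for pattern, app in rules.items():
        if re.search(pattern, text):
            return app  # one successful rule is enough
    return None  # the packet failed against all rules
```

A `None` result corresponds to the case in which the data packet fails to match the rule base and feature code extraction may begin.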
According to an alternative embodiment of the present invention, extracting the feature code of the data packet includes:
extracting header information of the data packet in each layer of protocol;
acquiring a plurality of pieces of target information from each piece of header information;
and connecting the plurality of pieces of target information to obtain the feature code.
According to an optional embodiment of the invention, the extracting header information of the data packet in each layer of protocol includes: extracting the payload of the data packet in an application layer protocol; and the acquiring a plurality of pieces of target information from each piece of header information includes: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
Each data packet has corresponding header information at each protocol layer, from which the following can be extracted: remote IP address, source and destination ports, layer 3 protocol number, layer 4 protocol number, domain name, host name, and HTTP-related header information.
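Connecting several pieces of target information from the header information into one feature code could look like the sketch below. The field names (`l4_proto`, `dst_port`, `host`), the `name=value` layout and the `|` separator are assumptions made for illustration; the description does not prescribe a concrete encoding.

```python
def header_feature_code(headers: dict, fields=("l4_proto", "dst_port", "host")) -> str:
    """Connect selected header fields into one feature code string."""
    parts = []
    for name in fields:
        value = headers.get(name)
        if value is not None:  # skip fields the packet does not carry
            parts.append(f"{name}={value}")
    return "|".join(parts)


# Hypothetical header information already parsed from one data packet.
headers = {
    "remote_ip": "203.0.113.7",
    "dst_port": 443,
    "l4_proto": "TCP",
    "host": "example.com",
}
code = header_feature_code(headers)
```

Which fields are selected is a policy decision; a field such as the remote IP address is left out here because it may change between servers of the same application.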
Application layer protocols, by contrast, are typically defined by the vendor or the application. They generally use a TLV (Type-Length-Value) format, and the feature code can be extracted according to the parsing rules for the payload of the application layer protocol. When the payload consists of displayable characters, it is generally built around a command code, and extraction is performed word by word. When the payload is binary data, it usually begins with the header of a private protocol, and the corresponding Type information is extracted according to its position in the data.
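The two payload cases can be illustrated with a short sketch. It assumes that the command code is the first word of a displayable payload and that the Type field is the first byte of a binary payload; both positions, as well as the helper names `is_displayable` and `payload_feature`, are assumptions for illustration, since real private protocols place these fields differently.

```python
def is_displayable(payload: bytes) -> bool:
    """True if every byte is printable ASCII or common whitespace."""
    return all(32 <= b <= 126 or b in (9, 10, 13) for b in payload)


def payload_feature(payload: bytes) -> str:
    """Extract a feature from an application-layer payload."""
    if not payload:
        return ""
    if is_displayable(payload):
        # Text protocol: take the first word as the command code.
        words = payload.decode("ascii").split()
        return "cmd:" + words[0] if words else ""
    # Binary protocol: take the first byte as the Type field (assumed layout).
    return "type:0x%02x" % payload[0]
```

For example, `payload_feature(b"LOGIN alice secret\r\n")` yields the command code `LOGIN`, while a binary payload starting with byte `0x07` yields that byte as its Type information.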
S14, generating rules according to the feature codes.
According to the extracted feature code, the network device generates a rule that the DPI system can use to match data packets. Each DPI system has its own rule format: some match with strings or regular expressions, and some match with Berkeley Packet Filter (BPF) rules compiled to bytecode.
According to an optional embodiment of the invention, the generating a rule according to the feature code comprises:
performing regular-expression matching on all characters in the feature code according to the basic rules of a set regular expression to obtain the rule.
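One possible reading of this step is sketched below: every character of the feature code is turned into a piece of a regular expression, with literal characters escaped and runs of digits generalised. The generalisation policy and the name `generate_rule` are assumptions of this sketch; the description does not fix the "basic rules" that are applied.

```python
import re


def generate_rule(feature_code: str) -> str:
    """Build a regular-expression rule string from a feature code."""
    parts = []
    for m in re.finditer(r"(\d+)|(\D+)", feature_code):
        if m.group(1) is not None:
            parts.append(r"\d+")                 # generalise any run of digits
        else:
            parts.append(re.escape(m.group(2)))  # match other characters literally
    return "^" + "".join(parts)


rule = generate_rule("cmd:LOGIN v2")
```

The resulting rule still matches the original feature code and also matches variants that differ only in their digits (for example a different version number), but not a different command.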
S15, judging whether the generated rule is valid or not according to the generated rule and other data packets in the data stream.
A rule generated from one data packet does not guarantee that the other data packets in the same data stream will also match it, so it is also necessary to determine whether the generated rule is valid.
The judging whether the generated rule is valid according to the generated rule and other data packets in the data flow comprises the following steps:
in the next round of DPI detection, matching other data packets in the data stream by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is greater than or equal to a preset matching rate threshold value;
when the matching passing rate is greater than or equal to the preset matching rate threshold value, determining that the generated rule is valid;
and when the matching passing rate is smaller than the preset matching rate threshold value, determining that the generated rule is invalid.
In this alternative embodiment, the generated rule needs to be written back into the DPI system and used for matching in the next round of DPI detection. If most of the data packets in the data stream match the generated rule correctly, rule generation for the application is considered successful and the generated rule is valid. Otherwise, if most of the data packets in the data stream do not match the generated rule correctly, rule generation for the application is considered to have failed and the generated rule is invalid.
For example, assume that a certain data stream contains 20 data packets. After a rule is generated from the first data packet, the remaining 19 data packets are matched against the generated rule and the matching passing rate is calculated. When the matching passing rate is greater than or equal to a preset matching rate threshold, the generated rule is considered valid. When the matching passing rate is smaller than the preset matching rate threshold, the generated rule is considered invalid; in that case the feature code of the first data packet may be re-extracted and a rule generated from it again. Alternatively, the feature code of the second data packet may be extracted and a rule generated from it, and the first data packet is then matched against that rule.
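The validity check in this example can be sketched as follows, assuming the rule is a regular-expression string and the remaining packets are given as text payloads. The function names, the sample payloads and the threshold values of 0.9 and 0.95 are assumptions for illustration; the description leaves the device-side threshold open.

```python
import re


def matching_pass_rate(rule: str, payloads) -> float:
    """Fraction of payloads matched by the rule (0.0 for an empty list)."""
    payloads = list(payloads)
    if not payloads:
        return 0.0
    hits = sum(1 for p in payloads if re.search(rule, p))
    return hits / len(payloads)


def rule_is_valid(rule: str, payloads, threshold: float = 0.9) -> bool:
    """A rule is valid when its pass rate is >= the preset threshold."""
    return matching_pass_rate(rule, payloads) >= threshold


# Rule generated from the first packet; the remaining 19 packets are checked.
rule = r"^LOGIN\s"
remaining = ["LOGIN user%d" % i for i in range(18)] + ["PING"]
rate = matching_pass_rate(rule, remaining)
```

With 18 of the 19 remaining packets matching, the matching passing rate is 18/19 (about 94.7%), so the rule is valid for a threshold of 0.9 but invalid for a threshold of 0.95.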
S16, when the generated rule is determined to be effective, updating the rule base.
A received data packet that does not match any rule in the current rule base indicates that the application is likely a new application. A rule is therefore generated from the data packet of the new application and added to the current rule base, so that data packets of the new application received later can be matched directly against the updated rule base.
According to an alternative embodiment of the present invention, the updating the rule base when it is determined that the generated rule is valid includes:
reporting the valid rule and the corresponding matching passing rate to a server;
receiving a final rule base generated by the server according to all the reported matching passing rates;
and replacing the rule base with the final rule base.
Although rules may be generated based on the relevant information extracted from the data packets, the rules may not be accurate enough and thus need to be uploaded to a server for further analysis and aggregation.
The server stores a rule list that records the newly generated rules received from all network devices together with the matching passing rate corresponding to each rule. The server then performs a calculation and determines the final rule base according to the result. Illustratively, assume that network device 1 sends the server rule 1 with a matching passing rate of 90% and rule 2 with a matching passing rate of 98%; network device 2 sends rule 1 with a matching passing rate of 96% and rule 2 with a matching passing rate of 97%; and network device 3 sends rule 1 with a matching passing rate of 92% and rule 2 with a matching passing rate of 99%. The server then calculates an average matching passing rate of 92.7% for rule 1 and 98% for rule 2. Since the average matching passing rate of rule 1 is below the predetermined matching passing rate threshold (e.g., 95%) and that of rule 2 is above it, the server adds rule 2 to the rule base and issues the updated final rule base to the network devices.
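The server-side calculation in this example can be reproduced with a small sketch. The report tuple layout and the name `aggregate_rules` are hypothetical, and accepting a rule whose average equals the threshold exactly is an assumption, since the example only covers averages above and below 95%.

```python
from collections import defaultdict


def aggregate_rules(reports, threshold=0.95):
    """Average the reported pass rates per rule and select accepted rules."""
    rates = defaultdict(list)
    for _device, rule, rate in reports:
        rates[rule].append(rate)
    averages = {rule: sum(v) / len(v) for rule, v in rates.items()}
    accepted = [rule for rule, avg in averages.items() if avg >= threshold]
    return averages, accepted


# (device, rule, matching passing rate) as reported in the example above.
reports = [
    ("device1", "rule1", 0.90), ("device1", "rule2", 0.98),
    ("device2", "rule1", 0.96), ("device2", "rule2", 0.97),
    ("device3", "rule1", 0.92), ("device3", "rule2", 0.99),
]
averages, accepted = aggregate_rules(reports)
```

The averages come out at about 92.7% for rule 1 and 98% for rule 2, so only rule 2 is accepted into the final rule base.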
According to an alternative embodiment of the present invention, when it is determined that the data packet matches a rule in the rule base, the method further comprises:
determining the application of the data packet according to the matched rule;
and distributing the data packet to a corresponding link according to the application of the data packet.
In this alternative embodiment, a data stream includes a plurality of data packets and corresponds to one application. Therefore, once a data stream has been successfully identified, only the application to which the data stream belongs needs to be determined, and the subsequent data packets in the data stream do not need DPI detection.
At present, the server and egress bandwidth resources of Internet users are limited, and link stability and real-time performance are not high. Users therefore often rent several better-quality Telecom or Unicom links for important services that require high real-time performance and stability, and rent ordinary links for unimportant services, thereby improving working efficiency and network resource utilization. In this scenario, a traffic steering function is needed to direct traffic to the appropriate link according to the application type and the user policy.
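A minimal sketch of per-stream identification combined with link allocation is given below. The application names, link names, `LINK_POLICY` table and the `identify` callback are all hypothetical; the sketch only illustrates that DPI runs once per identified data stream and that the link is then chosen from a user policy.

```python
# Hypothetical user policy: application -> outgoing link.
LINK_POLICY = {
    "video_conference": "premium_link",
    "file_download": "ordinary_link",
}
DEFAULT_LINK = "ordinary_link"

# Data stream key -> application already identified for that stream.
flow_apps = {}


def select_link(flow, identify):
    """Choose the link for a packet of `flow`, running DPI once per stream."""
    app = flow_apps.get(flow)
    if app is None:
        app = identify(flow)       # DPI detection for a not yet identified stream
        if app is not None:
            flow_apps[flow] = app  # later packets of this stream skip DPI
    return LINK_POLICY.get(app, DEFAULT_LINK)
```

Streams whose application cannot be identified, or has no policy entry, fall back to the ordinary link in this sketch; a real deployment might choose a different default.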
The deep packet inspection rule base generation method provided by the embodiment of the invention receives a data packet and matches it against the rules in the existing rule base to judge whether the application of the data packet is a new application. When the data packet does not match any rule in the existing rule base, the application of the data packet is taken to be a new application, and a rule is generated by extracting the feature code of the data packet so as to update the current rule base. By identifying the data stream to which the data packet belongs and matching the other data packets in the data stream against the generated rule, a matching passing rate is obtained, and whether the generated rule is valid is determined according to the matching passing rate. In this way, a rule can be automatically generated and added to the rule base for each new application, and the recognition rate of application traffic can be improved. Rule generation also does not lag behind the use of the application. When there are many network nodes, many rules are generated automatically; the server performs calculations on the generated rules to further determine their validity, which reduces the cost of manual identification.
Example 2
Fig. 2 is a schematic functional block diagram of a deep packet inspection rule base generating device according to a second embodiment of the present invention.
In some embodiments, the deep packet inspection rule base generation device 20 operates in a resource server. The deep packet inspection rule base generation device 20 may include a plurality of functional modules composed of program code segments. The program code of each program segment in the deep packet inspection rule base generation device 20 may be stored in a memory of a network device and executed by at least one processor to perform all or part of the steps of the deep packet inspection rule base generation method (described in detail with reference to FIG. 1).
In this embodiment, the deep packet inspection rule base generating device 20 may be divided into a plurality of functional modules according to the functions it performs. The functional modules may include: a receiving module 201, a matching module 202, an extracting module 203, a generating module 204, a judging module 205, an updating module 206, a determining module 207 and an allocating module 208. A module, as referred to in the present invention, is a series of computer program segments that are stored in a memory, can be executed by at least one processor, and perform a fixed function. The functions of the respective modules are described in detail below.
A receiving module 201, configured to receive a data packet and identify a data stream to which the data packet belongs.
A packet is the transmission unit of the network layer in the TCP/IP communication protocol suite, and is also its smallest unit. Packets having the same quadruple (e.g., source address, source port, destination address, destination port) form one data stream; that is, one data stream contains a plurality of data packets. For example, a TCP data stream may contain 3 client packets and 40 server packets.
In an application scenario, a terminal may send a data packet to a server, and when receiving the data packet, a network device may match the data packet with a rule, and release the data packet to the server after the matching is successful.
After receiving an application's data packet, the network device may identify the data stream to which the data packet belongs by acquiring relevant information in the data packet, such as the source IP address, destination IP address, host name, IP protocol type (TCP/UDP/ICMP), and source port number and/or destination port number range. Identifying the data stream to which a data packet belongs is prior art and is not described in detail here. For example, in the Linux kernel, the data stream can be identified using the kernel's netfilter connection tracking functionality.
A matching module 202, configured to determine whether the data packet matches a rule in the rule base.
A rule base is usually stored in the network device in advance and is used to perform matching detection on received data packets, thereby implementing the DPI function. The rule base may be an Intrusion Prevention System (IPS) rule base, a Uniform Resource Locator (URL) classification rule base, or the like. The rule base currently stored by the network device is referred to as the current rule base. The current rule base is not a database but a set of rules, i.e. descriptions of conditions, used to match the data packets of different applications. The current rule base comprises at least one rule and a rule tree constructed from the at least one rule, where each rule is a character string.
The network device performs security control according to the result of matching the data packet against the rules in the rule base. When the data packet matches a rule in the rule base, the data packet is released to the server. When the data packet does not match any rule in the rule base, the extraction module 203 may be executed, or the data packet may be discarded.
The extracting module 203 is configured to extract a feature code of the data packet when the data packet does not match a rule in the rule base.
When the rule base contains a plurality of rules, the data packet needs to be matched against each of them. If the data packet matches any one rule, it is considered to have matched the rule base; if it fails to match every rule, it is considered to have failed to match the rule base.
The matching process of the data packet and the rule is the prior art, and the present invention is not described in detail herein.
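The "match any rule" semantics described above can be sketched as follows. This is a minimal illustration that assumes each rule is a regular-expression string and scans the rules linearly; the rule tree mentioned earlier is not modelled, and `match_rule_base` is a hypothetical name.

```python
import re
from typing import List, Optional


def match_rule_base(payload: str, rules: List[str]) -> Optional[str]:
    """Return the first rule that matches the payload, or None if no rule matches."""
    for rule in rules:
        # A single successful match means the packet matches the rule base.
        if re.search(rule, payload):
            return rule
    # Every rule failed, so the packet does not match the rule base.
    return None


rules = [r"^POST ", r"^GET "]
matched = match_rule_base("GET /index HTTP/1.1", rules)
unmatched = match_rule_base("\x01\x02 private protocol", rules)
```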
According to an alternative embodiment of the present invention, the extracting module 203 extracting the feature code of the data packet includes:
extracting header information of the data packet in each layer of protocol;
acquiring a plurality of pieces of target information from each piece of header information;
and concatenating the plurality of pieces of target information to obtain the feature code.
According to an optional embodiment of the invention, extracting the header information of the data packet in each layer of protocol includes: extracting the payload of the data packet in an application layer protocol; and acquiring the plurality of pieces of target information from each piece of header information includes: acquiring a command code in the payload as target information, or acquiring Type information in the payload as target information.
Each data packet has header information at each protocol layer, from which the following can be extracted: the remote IP address, source and destination ports, layer-3 protocol number, layer-4 protocol number, domain name, host name, and HTTP-related header information.
Application layer protocols are typically defined by the vendor or the application itself. They generally use a TLV (type-length-value) format, and the feature code can be extracted according to the parsing rules for the payload of the application layer protocol. When the payload consists of displayable characters, it is generally dominated by command codes, and extraction is performed word by word. When the payload is binary data, it usually begins with the header of a private protocol, and the corresponding Type information is extracted according to its position in the data.
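The extraction described above might look like the following sketch. It rests on several assumptions that the text does not specify: the command code is taken to be the first word of a textual payload, the Type field is taken to be a single byte at a configurable offset (`type_offset`), and the target information is joined with a `|` separator. The function name `extract_feature_code` is hypothetical.

```python
# Bytes treated as "displayable": printable ASCII plus tab, LF and CR.
TEXT_BYTES = set(range(32, 127)) | {9, 10, 13}


def extract_feature_code(proto: str, dst_port: int, payload: bytes, type_offset: int = 0) -> str:
    """Join header fields and one payload-derived field into a feature code."""
    if payload and all(b in TEXT_BYTES for b in payload):
        # Textual payload: use the first word as the command code.
        words = payload.decode("ascii").split()
        target = words[0] if words else ""
    elif len(payload) > type_offset:
        # Binary payload: read the Type byte at the assumed position.
        target = "0x%02x" % payload[type_offset]
    else:
        target = ""
    # Concatenate the pieces of target information.
    return "|".join([proto, str(dst_port), target])


text_code = extract_feature_code("TCP", 80, b"GET /a HTTP/1.1\r\n")
binary_code = extract_feature_code("UDP", 9000, b"\x01\x00\x04abcd")
```

Here `text_code` is `"TCP|80|GET"` and `binary_code` is `"UDP|9000|0x01"`.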
A generating module 204, configured to generate a rule according to the feature code.
According to the extracted feature code, the network device generates a rule in a form that the DPI system can use to match data packets. Each DPI system's rule base has its own format: some match by strings or regular expressions, and some match by Berkeley Packet Filter (BPF) rules compiled into bytecode.
According to an alternative embodiment of the present invention, the generating module 204 generating a rule according to the feature code includes:
performing regular-expression matching on all characters in the feature code according to the basic rules of a preset regular expression to obtain the rule.
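The text does not spell out how the characters of a feature code are turned into a regular expression. One possible reading, shown purely as an assumption, is to keep literal characters escaped and to generalize runs of digits so that the rule also covers packets that differ only in numeric fields. The name `generate_rule` is hypothetical.

```python
import re


def generate_rule(feature_code: str) -> str:
    """Build a regular-expression rule from a feature code (illustrative only)."""
    # Splitting on a capturing group keeps the digit runs at odd indices.
    parts = re.split(r"(\d+)", feature_code)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            pieces.append(r"\d+")            # generalize any run of digits
        else:
            pieces.append(re.escape(part))   # keep other characters literal
    return "^" + "".join(pieces)


rule = generate_rule("GET /v1/item")
```

The resulting rule matches `GET /v22/item/9` as well as the original feature code, but not `POST /v1/item`.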
The judging module 205 is configured to judge whether the generated rule is valid according to the generated rule and other data packets in the data stream.
A rule generated from one data packet does not guarantee that the other data packets in the same data stream will also match it, so it is also necessary to judge whether the generated rule is valid.
The judging module 205 judging whether the generated rule is valid according to the generated rule and other data packets in the data stream includes:
in the next round of DPI detection, matching other data packets in the data stream by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is greater than or equal to a preset matching rate threshold value;
When the matching passing rate is greater than or equal to the preset matching rate threshold value, determining that the generated rule is valid;
And when the matching passing rate is smaller than the preset matching rate threshold value, determining that the generated rule is invalid.
In this alternative embodiment, the generated rule needs to be written back into the DPI system and used for matching in the next round of DPI detection. If most of the data packets in the data stream correctly match the generated rule, rule generation for the application is considered successful and the generated rule is valid. Otherwise, if most of the data packets in the data stream cannot correctly match the generated rule, rule generation for the application is considered to have failed and the generated rule is invalid.
For example, assuming that there are 20 packets in a certain data stream, after a rule is generated according to the first packet, the remaining 19 packets are used to match the generated rule, and the matching passing rate is calculated. When the matching passing rate is higher than (greater than or equal to) a preset matching rate threshold, the generated rule is considered valid. When the matching passing rate is lower than (smaller than) the preset matching rate threshold value, the generated rule is considered to be invalid, and the feature code of the first data packet can be re-extracted and the rule can be generated according to the feature code of the first data packet. Alternatively, the feature code of the second data packet may be extracted and a rule may be generated according to the feature code of the second data packet, and then the first data packet may be matched.
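The validity check in this example can be sketched as below. The 0.95 default threshold is an assumption chosen for illustration, and the function names are hypothetical; payloads are modelled as strings and rules as regular expressions.

```python
import re
from typing import List


def match_pass_rate(rule: str, packets: List[str]) -> float:
    """Fraction of the given packets that match the rule."""
    if not packets:
        return 0.0
    hits = sum(1 for payload in packets if re.search(rule, payload))
    return hits / len(packets)


def rule_is_valid(rule: str, packets: List[str], threshold: float = 0.95) -> bool:
    # Valid when the pass rate is greater than or equal to the threshold.
    return match_pass_rate(rule, packets) >= threshold


# 19 remaining packets of a 20-packet stream; 18 of them match the rule.
remaining = ["GET /item/%d" % i for i in range(18)] + ["\x00binary"]
rate = match_pass_rate(r"^GET ", remaining)
```

With 18 of 19 packets matching, the rate is about 94.7%, so the rule is invalid at a 95% threshold but valid at a 90% threshold.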
An updating module 206, configured to update the rule base when the generated rule is determined to be valid.
When the received data packet does not match any rule in the current rule base, the application is likely a new application. A rule is therefore generated from the new application's data packet and added to the current rule base, so that when data packets of the new application are received again later, they can be matched directly against the updated rule base.
According to an alternative embodiment of the present invention, the updating module 206 updating the rule base when the generated rule is determined to be valid includes:
reporting the valid rule and the corresponding matching passing rate to a server;
receiving a final rule base generated by the server according to all the matching passing rates;
updating the rule base to the final rule base.
Although rules may be generated based on the relevant information extracted from the data packets, the rules may not be accurate enough and thus need to be uploaded to a server for further analysis and aggregation.
The server stores a rule list that records the newly generated rules received from all network devices and the matching passing rate corresponding to each rule. The server then performs a calculation and determines the final rule base according to the result. For example, assume that network device 1 sends the server rule 1 with a matching passing rate of 90% and rule 2 with a matching passing rate of 98%; network device 2 sends rule 1 with a matching passing rate of 96% and rule 2 with a matching passing rate of 97%; and network device 3 sends rule 1 with a matching passing rate of 92% and rule 2 with a matching passing rate of 99%. The server then calculates the average matching passing rate to be 92.7% for rule 1 and 98% for rule 2. Since the average matching passing rate of rule 1 is less than the preset matching passing rate threshold (e.g., 95%) and that of rule 2 is greater than the threshold, the server adds rule 2 to the rule base and issues the updated final rule base to the network devices.
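The server-side calculation in this example can be reproduced with a short sketch. The report format (device, rule, rate) and the name `aggregate_reports` are assumptions made for illustration; the rates and the 95% threshold are those of the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def aggregate_reports(
    reports: List[Tuple[str, str, float]], threshold: float = 0.95
) -> Tuple[Dict[str, float], List[str]]:
    """Average the reported pass rates per rule and keep rules above the threshold."""
    rates = defaultdict(list)
    for _device, rule, rate in reports:
        rates[rule].append(rate)
    averages = {rule: sum(values) / len(values) for rule, values in rates.items()}
    # A rule enters the final rule base only if its average exceeds the threshold.
    accepted = [rule for rule, average in averages.items() if average > threshold]
    return averages, accepted


reports = [
    ("device 1", "rule 1", 0.90), ("device 1", "rule 2", 0.98),
    ("device 2", "rule 1", 0.96), ("device 2", "rule 2", 0.97),
    ("device 3", "rule 1", 0.92), ("device 3", "rule 2", 0.99),
]
averages, accepted = aggregate_reports(reports)
```

Rule 1 averages roughly 92.7% and is rejected, while rule 2 averages 98% and is accepted.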
According to an alternative embodiment of the present invention, when it is determined that the data packet matches a rule in the rule base, the deep packet inspection rule base generating device 20 further includes:
A determining module 207, configured to determine an application of the data packet according to the matched rule;
and the distribution module 208 is configured to distribute the data packet to a corresponding link according to the application of the data packet.
In this alternative embodiment, a data stream includes a plurality of data packets, and a data stream corresponds to an application. Therefore, when a certain data stream is successfully identified, only the application to which the data stream belongs needs to be judged, and then the subsequent data packets in the data stream do not need to be detected by DPI.
At present, the server and the export bandwidth resources of the internet users are limited, and the link stability and the real-time performance are not high, so that the users often rent a plurality of telecom or Unicom links with better quality for important services with high real-time performance and high stability, and rent common links for unimportant services, thereby improving the working efficiency and the network resource utilization rate. In this scenario, it is necessary to use a drainage function to direct traffic to the appropriate link according to the application type and user policy to achieve the goal.
The deep packet inspection rule base generating device provided by the embodiment of the present invention receives a data packet and matches it against the rules in the existing rule base to judge whether the application of the data packet is a new application. When the data packet does not match any rule in the existing rule base, the application of the data packet is a new application, and a rule is generated by extracting the feature code of the data packet so as to update the current rule base. By identifying the data stream to which the data packet belongs, the other data packets in the data stream are matched against the generated rule to obtain a matching passing rate, and whether the generated rule is valid is determined according to the matching passing rate. A rule can thus be automatically generated for each new application and added to the rule base, which can improve the recognition rate of application traffic. In addition, rule generation does not lag behind the use of the application. When there are many network nodes, many rules are generated automatically; the server performs calculations on the generated rules to further determine their validity, which reduces the cost of manual identification.
Example III
Fig. 3 is a schematic diagram of an internal structure of a network device according to an embodiment of the present invention.
In this embodiment, the network device 3 may be a client, a resource server, or other electronic devices.
The network device 3 may comprise a memory 31, a processor 32 and a bus 33.
The memory 31 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 31 may be an internal storage unit of the network device 3, such as a hard disk of the network device 3. In other embodiments, the memory 31 may be an external storage device of the network device 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the network device 3. Further, the memory 31 may comprise both an internal storage unit and an external storage device of the network device 3. The memory 31 may be used not only to store application programs installed in the network device 3 and various types of data, such as the code of the deep packet inspection rule base generating apparatus 20 and its modules, but also to temporarily store data that has been output or is to be output.
Processor 32 may in some embodiments be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip for executing program code or processing data stored in memory 31.
The bus 33 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in Fig. 3, but this does not mean that there is only one bus or only one type of bus.
Further, the network device 3 may further comprise a network interface, which may optionally comprise a wired interface and/or a wireless interface (e.g. WI-FI interface, bluetooth interface, etc.), typically used for establishing a communication connection between the network device 3 and other network devices.
Optionally, the network device 3 may further comprise a user interface, which may comprise a display and an input unit such as a keyboard, and optionally a standard wired interface and a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the network device and for displaying a visual user interface.
Fig. 3 shows only the network device 3 with components 31-33. A person skilled in the art will understand that the structure shown in Fig. 3 does not limit the network device 3, which may have either a bus-type or a star-type structure and may comprise fewer or more components than shown, a combination of certain components, or a different arrangement of components. Other electronic products, whether existing now or arising in the future, that are applicable to the present invention are also within the scope of the present invention and are incorporated herein by reference.
The above embodiments may be implemented in whole or in part by an application, hardware, firmware, or any combination thereof. When implemented using an application, they may be implemented in whole or in part in the form of a computer program product.
The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line) or wireless means (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., a solid state disk (SSD)).
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in application program functional units.
The integrated units, if implemented in the form of application functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of an application program product. The product is stored in a storage medium and comprises several instructions for causing a network device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a hard disk, a read-only memory (ROM), a magnetic disk, or an optical disk.
It should be noted that the serial numbers of the foregoing embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, article, or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, article, or method that comprises the element.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process derived from the disclosure herein, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present invention.

Claims (8)

1. A deep packet inspection rule base generation method, applied to a network device, the method comprising:
receiving a data packet and identifying a data stream to which the data packet belongs;
judging whether the data packet is matched with rules in a rule base or not;
extracting a feature code of the data packet when the data packet is not matched with the rules in the rule base;
Generating a rule according to the feature code;
Judging whether the generated rule is valid or not according to the generated rule and other data packets in the data stream;
updating the rule base when the generated rule is determined to be valid;
the judging whether the generated rule is valid according to the generated rule and other data packets in the data flow comprises the following steps:
in the next round of DPI detection, matching other data packets in the data stream by using the generated rule and calculating a matching passing rate;
judging whether the matching passing rate is larger than a preset matching rate threshold value or not;
when the matching passing rate is greater than or equal to the preset matching rate threshold value, determining that the generated rule is valid; or
When the matching passing rate is smaller than the preset matching rate threshold value, determining that the generated rule is invalid;
The updating the rule base when the generated rule is determined to be valid comprises:
Reporting the effective rules and the corresponding matching passing rates to a server, so that the server calculates average matching passing rates according to the matching passing rates corresponding to the effective rules of all network devices;
Receiving a final rule base generated by the server according to the fact that the average matching passing rate is larger than a preset matching passing rate threshold;
Updating the rule base as the final rule base.
2. The method of claim 1, wherein extracting the signature of the data packet comprises:
extracting header information of the data packet in each layer of protocol;
Acquiring a plurality of target information in each piece of head information;
and connecting the plurality of target information to obtain feature codes.
3. The method of claim 2, wherein,
The extracting header information of the data packet in each layer of protocol comprises the following steps: extracting the payload of the data packet in an application layer protocol;
The acquiring the plurality of target information in each of the header information includes: and acquiring a command code in the payload as target information or acquiring Type information in the payload as target information.
4. A method according to any one of claims 1 to 3, wherein said generating a rule according to said signature comprises:
And carrying out regular matching on all characters in the feature codes according to the basic rule of the set regular expression to obtain a rule.
5. A method according to any one of claims 1 to 3, wherein when it is determined that the data packet matches a rule in the rule base, the method further comprises:
determining the application of the data packet according to the matched rule;
and distributing the data packet to a corresponding link according to the application of the data packet.
6. A deep packet inspection rule base generating apparatus, wherein the deep packet inspection rule base generating method of any one of claims 1 to 5 is applied, the apparatus comprising:
the receiving module is used for receiving the data packet and identifying the data flow to which the data packet belongs;
The matching module is used for judging whether the data packet is matched with the rule in the rule base or not;
the extraction module is used for extracting the feature codes of the data packet when the data packet is not matched with the rules in the rule base;
The generation module is used for generating rules according to the feature codes;
the judging module is used for judging whether the generated rule is valid or not according to the generated rule and other data packets in the data stream;
and the updating module is used for updating the rule base when the generated rule is determined to be valid.
7. A network device, characterized in that the network device comprises a memory and a processor, the memory storing a download program of a deep packet inspection rule base generation method executable on the processor, the download program of the deep packet inspection rule base generation method implementing the deep packet inspection rule base generation method according to any one of claims 1 to 5 when executed by the processor.
8. A computer-readable storage medium, wherein a download program of a deep packet inspection rule base generation method is stored on the computer-readable storage medium, the download program of the deep packet inspection rule base generation method being executable by one or more processors to implement the deep packet inspection rule base generation method of any one of claims 1 to 5.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910957075.1A CN110708215B (en) 2019-10-10 2019-10-10 Deep packet inspection rule base generation method, device, network equipment and storage medium


Publications (2)

Publication Number Publication Date
CN110708215A CN110708215A (en) 2020-01-17
CN110708215B CN110708215B (en) 2024-06-14





Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant