CN115379026A - Method, device, equipment and storage medium for identifying message header field - Google Patents


Publication number
CN115379026A
Authority
CN
China
Prior art keywords
header field
information
field
header
field information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210412422.4A
Other languages
Chinese (zh)
Other versions
CN115379026B (en)
Inventor
刘科栋
曲德帅
徐太忠
王大伟
李扬曦
李舒
杨威
刘庆云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Application filed by National Computer Network and Information Security Management Center
Priority to CN202210412422.4A
Publication of CN115379026A
Application granted
Publication of CN115379026B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to the technical field of network security and provides a method, a device, equipment and a storage medium for identifying a message header field. The method comprises: obtaining message information to be identified; extracting header field information from the message information; performing positioning processing on the header field information to determine the header field tree node corresponding to the header field information; and performing matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information. This addresses the problems caused in the prior art by identifying HTTP protocol header fields through character string matching alone, enables rapid identification of protocol headers, and improves the efficiency of protocol header field identification.

Description

Method, device, equipment and storage medium for identifying message header field
Technical Field
The present application relates to the field of network security technologies, and in particular, to a method, an apparatus, a device, and a storage medium for identifying a packet header field.
Background
The HyperText Transfer Protocol (HTTP) is the most widely used network protocol on the Internet, and identifying HTTP quickly is a key basis of a Deep Packet Inspection (DPI) system.
Existing HTTP protocol header field identification methods generally rely on character string matching: an HTTP protocol header field to be identified is compared one by one against each preset header field, and the identification result is determined from the character string matching result. Because these methods rely on character string matching alone, HTTP protocol header field identification performs poorly, which greatly affects the processing performance of the DPI system.
Disclosure of Invention
In order to solve the technical problem or at least partially solve the technical problem, the present application provides a method, an apparatus, a device, and a storage medium for identifying a message header field.
In a first aspect, the present application provides a method for identifying a header field of a packet, including:
acquiring message information to be identified;
extracting header field information from the message information;
performing positioning processing on the header field information to determine a header field tree node corresponding to the header field information;
and performing matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information.
Optionally, the performing positioning processing on the header field information to determine a header field tree node corresponding to the header field information includes:
determining a header type and a field length corresponding to the header field information;
determining a target header field tree corresponding to the header field information based on the header type;
and searching the target header field tree for the header field tree node based on the field length.
Optionally, the performing matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information includes:
performing character string matching identification based on the header field tree node and the header field information to obtain a character string matching identification result;
if the character string matching identification result is a matching identification success result, taking a header field identification success result as the header field identification result;
and if the character string matching identification result is a matching identification failure result, performing hash identification on the header field information to obtain the header field identification result.
Optionally, the performing character string matching identification based on the header field tree node and the header field information to obtain a character string matching identification result includes:
judging whether a preset character string in the header field tree node matches the character string in the header field information;
if the character string in the header field tree node matches the character string in the header field information, generating a matching identification success result;
and if the character string in the header field tree node does not match the character string in the header field information, generating a matching identification failure result.
Optionally, the performing hash identification on the header field information to obtain the header field identification result includes:
acquiring a preset hash table;
judging whether hash header field information in the hash table matches the header field information;
if the hash header field information matches the header field information, taking a hash header field identification success result as the header field identification result;
and if the hash header field information does not match the header field information, taking a hash header field identification failure result as the header field identification result.
Optionally, the extracting header field information from the message information includes:
extracting an end identifier from the message information;
determining header field line data based on the end identifier;
and extracting header field information from the header field line data.
Optionally, before the acquiring a preset hash table, the method further includes:
acquiring system configuration file information;
reading protocol header field information from the system configuration file information, wherein the protocol header field information comprises unusual header field information and custom header field information;
and generating the hash table based on the unusual header field information and the custom header field information.
In a second aspect, the present application provides an apparatus for identifying a header field of a packet, including:
the acquisition module is used for acquiring message information to be identified;
an extraction module, configured to extract header field information from the message information;
a positioning processing module, configured to perform positioning processing on the header field information, and determine a header field tree node corresponding to the header field information;
and the matching identification processing module is used for carrying out matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information.
In a third aspect, the present application provides an apparatus for identifying a header field of a packet, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
a processor, configured to implement the steps of the method for identifying a header field of a packet according to any embodiment of the first aspect when executing a program stored in a memory.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method for identifying a header field of a packet according to any one of the embodiments of the first aspect.
In summary, according to the present application, the message information to be identified is obtained, the header field information is extracted from the message information, the header field information is located, the header field tree node corresponding to the header field information is determined, and the header field information is subjected to matching identification processing based on the header field tree node, so as to obtain the header field identification result corresponding to the message information, thereby solving the problem in the prior art that the HTTP protocol header identification is performed only by adopting a character string matching method, realizing the rapid identification of the protocol header, and improving the protocol header identification efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. A person skilled in the art can obviously obtain other drawings from these drawings without inventive effort.
Fig. 1 is a flowchart illustrating steps of a method for identifying a header field of a packet according to an embodiment of the present application;
Fig. 2 is a flowchart illustrating steps of a method for identifying a header field of a packet according to an alternative embodiment of the present application;
Fig. 3 is a diagram of a header field tree provided by the present application;
Fig. 4 is a hash table diagram provided by the present application;
Fig. 5 is a block diagram of a structure of an apparatus for identifying a header field of a packet according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an identification device for a message header field according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In a specific implementation, a DPI system identifies and analyzes network traffic to provide a basis for services such as traffic analysis, network planning, and user behavior analysis, and to enable fine-grained management of network service applications. Quick identification of HTTP protocol header fields is a key basis of the DPI system. Most existing HTTP protocol header field identification methods rely on character string matching alone. As a result, protocol identification performance is low, and custom HTTP header fields either cannot be identified or require a new processing flow to be added for every additional header field to be identified.
One of the core concepts of the embodiments of the present application is to provide a method for identifying a message header field that combines three techniques: identification based on a header field tree, character string matching identification, and fast hash identification. The combination allows HTTP protocol headers to be identified efficiently during high-speed network traffic processing, which speeds up HTTP header field identification and HTTP protocol parsing and is of practical value for improving the processing performance of a DPI system.
For the purpose of facilitating understanding of the embodiments of the present application, the following description will be further explained with reference to the accompanying drawings and specific embodiments, which are not intended to limit the embodiments of the present application.
Fig. 1 is a flowchart illustrating a method for identifying a header field of a packet according to an embodiment of the present application. As shown in fig. 1, the method for identifying a header field of a packet provided by the present application may specifically include the following steps:
Step 110, acquiring message information to be identified.
Specifically, in the embodiment of the present application, the acquired message information that needs to be subjected to header field identification may be used as the message information to be identified, where the message information may be an HTTP protocol message, and certainly may also be other messages that need to be subjected to header field identification, and the embodiment of the present application does not specifically limit the message information.
Step 120, extracting header field information from the message information.
Specifically, after the message information to be identified is obtained, header field information may be extracted from it. Header field information belongs to a header type; the header types may include a request header, a response header, a general header, and the like, and this example does not specifically limit the header type. Each header type may include one or more pieces of header field information. For example, the request header may include authorization field information and proxy-authorization field information, the response header may include server field information and last-modified field information, and the general header may include via field information, host field information, cookie field information, and the like.
In a specific implementation, the message information to be identified may include one or more headers, and each header may include one or more pieces of header field information, so multiple pieces of header field information may be extracted from one message. To ensure that every piece of header field information in the message is identified, subsequent processing may identify each piece separately to obtain its own identification result, and then determine the header field identification result corresponding to the message information from those individual results.
In a specific implementation, when the message information is an HTTP protocol message, the embodiment of the present application may use an HTTP measurement method to count how often the header field corresponding to each piece of header field information occurs in received HTTP protocol messages. The usage frequency of each header field is determined from its occurrence frequency, and the header fields of HTTP protocol messages may then be divided into common header fields, unusual header fields, and custom header fields.
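The frequency-based division described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not give a threshold or say how custom fields are recognized, so `COMMON_THRESHOLD`, `STANDARD_FIELDS`, and `classify_header_fields` are hypothetical, and a field is treated as custom simply when it is absent from the assumed standard list.

```python
from collections import Counter

# Hypothetical inputs: the source gives neither a threshold nor a list of standard fields.
STANDARD_FIELDS = {"host", "date", "via", "upgrade", "server"}
COMMON_THRESHOLD = 0.5  # assumed share of messages a field must appear in to be "common"


def classify_header_fields(messages):
    """Count in how many messages each header field appears, then bucket the fields.

    `messages` is an iterable of lists of header field names, one list per message.
    """
    counts = Counter()
    total = 0
    for names in messages:
        total += 1
        counts.update({name.lower() for name in names})  # count each field once per message

    classes = {"common": set(), "unusual": set(), "custom": set()}
    for name, seen in counts.items():
        if name not in STANDARD_FIELDS:
            classes["custom"].add(name)
        elif seen / total >= COMMON_THRESHOLD:
            classes["common"].add(name)
        else:
            classes["unusual"].add(name)
    return classes


sample = [
    ["Host", "Date", "X-Trace-Id"],
    ["Host", "Via"],
    ["Host", "Date", "Upgrade"],
    ["Host", "Date"],
]
result = classify_header_fields(sample)
```

In this sample, `host` and `date` appear in at least half of the messages and land in the common set, `via` and `upgrade` are standard but rare, and `x-trace-id` is treated as custom.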
For example, in an HTTP protocol message, the common header fields of the request header may include the authorization field and the proxy-authorization field; the common header fields of the response header may include the server field and the last-modified field; and the common header fields of the general header may include the via, host, etag, date, cookie, pragma, repeater, trailer, expires, charset, location, connection, user-agent, content-type, content-range, content-length, transfer-length, x-flash-version, content-location, content-coding, content-transmission-length, content-distribution, and code-location fields. The unusual header fields may include the date, upgrade, cache-control, connection, and accept-* fields, where accept-* denotes any header field beginning with "accept", such as the accept, accept-charset, accept-encoding, and accept-language fields. This example does not specifically limit these lists.
Step 130, positioning processing is performed on the header field information, and a header field tree node corresponding to the header field information is determined.
Specifically, after determining the header field information, the embodiment of the present application may perform positioning processing on the header field information, and determine a header field tree node corresponding to the header field information. Specifically, the header field type corresponding to the header field information may be determined according to the header field information, then the header field tree corresponding to the header field type may be determined according to the header field type, and then the corresponding node may be searched in the header field tree based on the header field information, so as to determine the header field tree node corresponding to the header field information.
For example, when the header field information is authorization field information, the corresponding header type is the request header. A request header field tree may therefore be determined based on the request header, and the header field tree node corresponding to the authorization field information may then be searched for in that tree. Specifically, the field length corresponding to the authorization field information may be obtained, where the field length is the length of the character string "authorization", that is, 13. Based on the field length 13, a header field tree node with a field length of 13 may be searched for in the header field tree and used as the header field tree node corresponding to the authorization field information.
In an optional embodiment, a header field tree may be generated in advance for each header type, and header field tree nodes may then be generated for the header fields included in that header type. For example, when the header type is the request header and the request header includes the authorization field and the proxy-authorization field, a request header field tree may be generated for the request header, and a header field tree node for the authorization field may be generated from the authorization field and its field length of 13, so that the corresponding header field tree node can later be determined from header field information.
Further, header fields of the same header type may have the same field length. To improve the identification rate, when header field tree nodes are generated from the header fields and their field lengths, header fields with the same field length may be gathered into a single header field tree node, which contains the character strings of all header fields of that length. For example, when the header type is the general header and it includes the host field, the etag field, and the date field, each of these fields has a field length of 4. A single header field tree node may be generated for field length 4, containing the character string "host" for the host field, "etag" for the etag field, and "date" for the date field. In subsequent processing, the character strings contained in the header field tree node are matched against the character string in the header field information to identify the header field information quickly, that is, step 140 is executed.
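The tree layout described above can be sketched as a two-level mapping: one tree per header type, and within each tree one node per field length holding every field name of that length. The names `HEADER_FIELDS`, `build_header_field_trees`, and `locate_node` are hypothetical, and the dictionary-based layout is an assumption; the source does not prescribe a concrete data structure.

```python
from collections import defaultdict

# Hypothetical field lists taken from the examples in the text.
HEADER_FIELDS = {
    "request": ["authorization", "proxy-authorization"],
    "response": ["server", "last-modified"],
    "general": ["host", "etag", "date", "via"],
}


def build_header_field_trees(fields_by_type):
    """Build one tree per header type; each node groups the field names of one length."""
    trees = {}
    for header_type, names in fields_by_type.items():
        nodes = defaultdict(list)
        for name in names:
            nodes[len(name)].append(name)  # fields of equal length share a node
        trees[header_type] = dict(nodes)
    return trees


def locate_node(trees, header_type, field_name):
    """Return the candidate strings in the node for this type and length (empty if none)."""
    return trees.get(header_type, {}).get(len(field_name), [])


trees = build_header_field_trees(HEADER_FIELDS)
```

With this layout, locating the node for `date` under the general header returns the three four-character candidates `host`, `etag`, and `date`, so only those strings need to be compared in the matching step.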
Step 140, performing matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information.
Specifically, the header field identification result may be either a header field identification success result or a header field identification failure result, which is not limited in this embodiment of the present application. After determining the header field tree node corresponding to the header field information, the embodiment of the present application may match the character string of each header field contained in the node against the character string in the header field information, to determine whether the node contains a header field that matches the header field information. If it does, the matching identification result is a matching identification success result; if it does not, the matching identification result is a matching identification failure result. The header field identification result corresponding to the message information can then be determined from the matching identification result: on a successful match, the header field identification result may be determined as a header field identification success result, and the header field that matches the header field information may be used as the header field identification result.
For example, when the header field tree node corresponding to the header field information contains the character strings "host", "etag", and "date", each of these character strings may be matched against the character string in the header field information. If any of them matches, a matching identification success result is determined. If the header field information is host field information, the "host" character string in the header field tree node matches the "host" character string in the host field information, so a matching identification success result is obtained and the header field corresponding to the header field information is determined to be the host field.
In a specific implementation, when performing matching identification processing on header field information based on a header field tree node, in order to improve matching speed, a character string with high occurrence frequency in an HTTP protocol message may be preferentially matched. For example, the frequency of occurrence of the header field in the HTTP protocol message may be counted by an HTTP traffic measurement method, and the character strings corresponding to the header field with a higher frequency of occurrence may be preferentially matched, so as to improve the message header field recognition speed.
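Frequency-ordered matching inside a node can be sketched as below. The occurrence counts are invented for illustration, `order_node_by_frequency` and `match_in_node` are hypothetical names, and the lowercase comparison is an added assumption (HTTP field names are case-insensitive) that the source does not discuss.

```python
def order_node_by_frequency(candidates, frequency):
    """Sort a node's strings so the most frequently observed field is compared first."""
    return sorted(candidates, key=lambda name: frequency.get(name, 0), reverse=True)


def match_in_node(candidates, field_name):
    """Compare field_name with each candidate in order; return (match, comparisons made)."""
    key = field_name.lower()
    for comparisons, candidate in enumerate(candidates, start=1):
        if candidate == key:
            return candidate, comparisons
    return None, len(candidates)


# Hypothetical occurrence counts standing in for HTTP traffic measurement results.
observed = {"host": 980, "date": 640, "etag": 210}
node = order_node_by_frequency(["etag", "date", "host"], observed)
```

After reordering, a `Host` field is found on the first comparison, while a rarer field such as `etag` costs three comparisons, which is the trade-off the frequency ordering makes.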
In actual processing, the header field information contained in the message information may include custom header field information and/or unusual header field information, in which case the header field tree node may not contain a header field that matches the header field information. To identify custom and unusual header fields as well, and thereby improve the accuracy of header field identification, a hash table may be generated in advance from the custom header field information and the unusual header field information. When it is determined that no header field in the header field tree node matches the header field information, a hash lookup may be performed on the header field information, that is, the hash table is searched for a header field that matches the header field information, to obtain a hash lookup result. If the hash table contains a matching header field, the hash lookup result is a lookup success result, a header field identification success result corresponding to the message information may be determined from it, and the matching header field in the hash table is determined as the header field corresponding to the header field information. If the hash table contains no matching header field, the hash lookup result is a lookup failure result, and a header field identification failure result corresponding to the message information may be determined from it.
By combining header field tree identification, character string identification, and hash identification, HTTP header fields can be identified quickly in high-speed real-time network traffic, while custom header fields and unusual header fields can also be identified, which improves the accuracy of header field identification.
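The two-stage identification, string matching in the located node followed by a hash lookup on failure, can be sketched as below. The function names and field lists are hypothetical; a Python `set` is used as the hash table, and the source says only that the table is generated in advance from unusual and custom header field information read from a system configuration file.

```python
def build_hash_table(unusual_fields, custom_fields):
    """Pre-build the fallback table; a Python set is itself a hash table."""
    return {name.lower() for name in (*unusual_fields, *custom_fields)}


def identify_field(node_strings, hash_table, field_name):
    """Return (matched_field, stage); matched_field is None if both stages fail."""
    key = field_name.lower()
    if key in node_strings:  # stage 1: string match inside the located tree node
        return key, "tree"
    if key in hash_table:    # stage 2: hash lookup for unusual and custom fields
        return key, "hash"
    return None, "failed"


# Hypothetical field lists; the source reads them from a system configuration file.
table = build_hash_table(["upgrade", "cache-control"], ["x-trace-id"])
node_strings = ["host", "etag", "date"]
```

Under these assumptions a custom field can be supported by adding its name to the configured list, without adding a new processing branch to the identification code.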
Therefore, the embodiment of the application obtains the message information to be identified, extracts the header field information from the message information, performs positioning processing on the header field information, determines the header field tree node corresponding to the header field information, and performs matching identification processing on the header field information based on the header field tree node to obtain the header field identification result corresponding to the message information, so that the problem caused by the fact that HTTP protocol header identification is performed only in a character string matching manner in the prior art is solved, rapid identification of the protocol header is realized, and the protocol header identification efficiency is improved.
Referring to fig. 2, a schematic flowchart illustrating steps of a method for identifying a header field of a packet according to an alternative embodiment of the present application is shown. The method for identifying a message header field may specifically include the following steps:
Step 210, obtaining message information to be identified.
Step 220, extracting header field information from the message information.
In a specific implementation, when the message information is an HTTP protocol message, the HTTP protocol standard specifies that each header field has the format key: value and is terminated by a carriage return and line feed (\r\n). Complete lines of the HTTP message can therefore be obtained by locating the carriage return and line feed, and the header field key can then be extracted from each complete line as the header field information. Accordingly, extracting header field information from the message information in the embodiment of the present application may specifically include: extracting an end identifier from the message information; determining header field line data based on the end identifier; and extracting header field information from the header field line data. Once the end identifiers are determined, the start position of each header field in the HTTP message can be derived from them (each header field starts immediately after the end identifier of the preceding line), so that the complete line data of one or more header fields can be extracted from the HTTP message. For the complete line data of each header field, the header field key is extracted as the header field information of that header field, so that a header field tree node can subsequently be determined for it.
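The extraction steps can be sketched as follows, assuming the raw message is available as bytes. `extract_header_field_keys` is a hypothetical helper; skipping the start line, stopping at the first blank line, and lowercasing the keys are choices made for this sketch rather than details given in the source.

```python
CRLF = b"\r\n"


def extract_header_field_keys(raw_message):
    """Return the header field keys found in the header section of a raw HTTP message."""
    header_section = raw_message.split(CRLF + CRLF, 1)[0]  # a blank line ends the headers
    lines = header_section.split(CRLF)  # each CRLF is the end identifier of one line
    keys = []
    for line in lines[1:]:  # lines[0] is the request or status line, not a header field
        key, sep, _value = line.partition(b":")
        if sep and key.strip():
            keys.append(key.strip().decode("ascii", errors="replace").lower())
    return keys


request = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"User-Agent: demo\r\n"
    b"Authorization: Basic abc\r\n"
    b"\r\n"
    b"body: not-a-header"
)
keys = extract_header_field_keys(request)
```

Only the key before the first colon is kept, since the key alone is what the later positioning and matching steps operate on.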
Step 230, determining the header field type and the field length corresponding to the header field information.
Specifically, the header field type may be a request header field, a response header field, a general header field, and the like, which is not specifically limited in this embodiment of the application, and the field length may be the length of the character string corresponding to the header field information. For example, if the header field information is authorization field information, the corresponding header field type may be the request header field, the corresponding character string may be "authorization", and the length of that character string is 13. As another example, if the header field information is server field information, the corresponding header field type may be the response header field, the corresponding character string may be "server", and the length of that character string is 6. As a further example, if the header field information is via field information, the corresponding header field type may be the response header field, the corresponding character string may be "via", and the length of that character string is 3.
Step 240, determining a target header field tree corresponding to the header field information based on the header field type.
Specifically, after determining the type of the header field, the embodiment of the present application may determine, based on the type of the header field, a header field tree corresponding to the header field information, so as to serve as the target header field tree. For example, referring to fig. 3, in the case that the header type is a request header, a request header field tree corresponding to the request header may be determined, and the request header field tree may be used as a target header field tree corresponding to the header field information; under the condition that the type of the header field is the response header field, determining a response header field tree corresponding to the response header field, and using the response header field tree as a target header field tree corresponding to the header field information; when the type of the header field is a general header field, a general header field tree corresponding to the general header field may be determined, and the general header field tree may be used as a target header field tree corresponding to the header field information.
As an example of the present application, when the header field information is the authorization field information, the header type corresponding to the authorization field information is a request header, a header field tree corresponding to the request header may be a request header field tree, and the request header field tree may be a target header field tree corresponding to the authorization field information.
In an alternative embodiment, a header field tree may be constructed in advance for each header field type. For example, referring to fig. 3, a request header field tree may be constructed for the request header field type, a response header field tree for the response header field type, and a general header field tree for the general header field type.
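As a minimal sketch, and not the structure actually shown in fig. 3, the pre-constructed header field trees could be represented as one mapping per header field type whose nodes are keyed by field length. The names `KNOWN_FIELDS` and `build_header_trees` and the abbreviated field lists are hypothetical; the assignment of "via" to the response type follows the example given in this description.

```python
from collections import defaultdict

# Hypothetical, abbreviated field lists; a real system would load the full sets.
KNOWN_FIELDS = {
    "request": ["authorization", "host", "accept", "expect", "user-agent"],
    "response": ["server", "via", "location", "etag"],
    "general": ["date", "pragma", "connection", "cache-control"],
}


def build_header_trees(fields_by_type):
    """Map header field type -> {field length -> [preset strings of that length]}."""
    trees = {}
    for field_type, names in fields_by_type.items():
        nodes = defaultdict(list)
        for name in names:
            nodes[len(name)].append(name)  # same-length fields share one node
        trees[field_type] = dict(nodes)
    return trees


TREES = build_header_trees(KNOWN_FIELDS)
print(TREES["request"][13])  # node holding the 13-character request fields
print(TREES["request"][6])   # node holding two 6-character request fields
```

Grouping by length means a later lookup only compares the header field key with preset strings of the same length rather than with every known field.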
Step 250, based on the field length, searching the head field tree node from the target head field tree.
Specifically, after the header field tree is determined, the header field tree node may be searched for from the header field tree based on the field length corresponding to the header field information. For example, referring to fig. 3, in a case that the header field information is the authorization field information, the header field tree corresponding to the authorization field information may be a request header field tree, the field length corresponding to the authorization field information may be 13, and then a tree node with a field length of 13 may be searched in the request header field tree as a header field tree node corresponding to the header field information.
Step 260, performing character string matching identification based on the header field tree node and the header field information to obtain a character string matching identification result.
Specifically, after determining the header field tree node corresponding to the header field information, the embodiment of the present application may match the preset character string contained in the header field tree node against the character string in the header field information to obtain the character string matching result corresponding to the header field information. The preset character string may be the character string corresponding to a header field; for example, if the header field is the authorization field, the corresponding character string is "authorization".
In a specific implementation, the head field tree node may include a plurality of preset character strings, that is, the head field tree node includes a plurality of head fields with the same length, for example, referring to the head field tree node with the length of 6 in the general head field tree in fig. 3. Each preset character string contained in the head field tree node can be matched with the character string in the head field information one by one, and a character string matching result corresponding to the head field information is obtained.
Optionally, in the embodiment of the present application, performing character string matching and recognition based on the head field tree node and the head field information to obtain a result of character string matching and recognition may specifically include the following substeps:
Sub-step 2601, determining whether the preset character string in the header field tree node matches the character string in the header field information.
Sub-step 2602, if the character string of the head field tree node matches the character string in the head field information, generating a matching identification success result.
Sub-step 2603, if the character string of the head field tree node does not match the character string in the head field information, generating a matching identification failure result.
Specifically, the preset character string in the head field tree node may be matched with the character string in the head field information, a successful result of character string matching recognition may be generated when the character string in the head field tree node is matched with the character string in the head field information, and a failed result of character string matching recognition may be generated when the character string in the head field tree node is not matched with the character string in the head field information.
In a specific implementation, when a header field tree node contains a plurality of preset character strings, each of them may be matched against the character string in the header field information. If any of the preset character strings matches the character string in the header field information, the character string matching is determined to be successful and a character string matching identification success result is generated. If none of them matches, the character string matching is determined to have failed and a character string matching identification failure result is generated.
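The node lookup of step 250 and the one-by-one matching of step 260 can be illustrated with the following sketch. It assumes a header field tree is held as a mapping from field length to a list of preset strings; the function names and the sample tree are hypothetical.

```python
def match_in_node(node_strings, field_key):
    """Compare field_key with each preset string in the node, one by one.

    Returns the matching preset string (matching identification success)
    or None (matching identification failure).
    """
    for preset in node_strings:
        if preset == field_key:
            return preset
    return None


def identify_by_tree(tree, field_key):
    """Locate the node by field length, then match inside that node."""
    node = tree.get(len(field_key), [])  # no node of this length -> empty list
    return match_in_node(node, field_key)


# Hypothetical request header field tree: field length -> preset strings.
request_tree = {13: ["authorization"], 6: ["accept", "expect"], 4: ["host"]}

print(identify_by_tree(request_tree, "expect"))   # found in the length-6 node
print(identify_by_tree(request_tree, "cookie"))   # length-6 node exists, no match
print(identify_by_tree(request_tree, "upgrade"))  # no node of length 7
```

A failure here does not end identification; in the method described, it triggers the hash identification of step 280.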
Step 270, if the character string matching identification result is a matching identification success result, taking a header field identification success result as the header field identification result.
Specifically, if the result of the string matching identification is a successful result of the matching identification, it may be determined that the header field information has been successfully identified, and then the successfully matched string may be used as the header identification result corresponding to the header field information. For example, when the string successfully matched with the header field information is "authorization", it may be determined that the header field information is authorization field information, and the authorization field information may be used as the identification result of the header field information.
Step 280, if the string matching identification result is a matching identification failure result, performing hash identification on the header field information to obtain the header field identification result.
In a specific implementation, the acquired message information to be identified may contain uncommon header field information and/or custom header field information, such as accept-field information, upgrade field information, a custom header field, and the like. In this case, the header field tree node may contain no preset character string that matches the character string in the header field information, so matching against the preset character strings in the header field tree node yields a character string matching identification failure result, and the header field information cannot be identified. To identify uncommon header field information and/or custom header field information, hash identification can be performed on the header field information to obtain the header field identification result. Specifically, referring to fig. 4, a hash table containing the uncommon header fields and custom header fields may be created in advance. After character string matching identification fails, the header field information is matched against the uncommon header fields and custom header fields in the hash table to obtain a matching result: if the matching result is a matching success result, a header field identification success result is determined; if the matching result is a matching failure result, a header field identification failure result is determined. Using this fast hash identification improves the accuracy of header field identification.
Optionally, in the embodiment of the present application, when the result of matching and identifying the character string is a result of failed matching and identifying, performing hash identification on the header field information to obtain the header field identification result, specifically, the following sub-steps may be included:
Sub-step 2801, obtaining a preset hash table.
In actual processing, before obtaining the preset hash table, the embodiment of the present application may specifically include: acquiring system configuration file information; reading protocol header field information from the system configuration file information, where the protocol header field information includes uncommon header field information and custom header field information; and generating the hash table based on the uncommon header field information and the custom header field information. Specifically, the method for identifying a message header field provided by the embodiment of the present application may be applied to a DPI system. When the DPI system is initialized, the protocol header field information, which may include uncommon header field information and custom header field information, can be read from the system configuration file information, and the hash table shown in fig. 4 can then be generated from it. Subsequently, it can be determined whether the hash header field information contained in the hash table matches the header field information; as shown in fig. 4, a hash lookup can be performed on the header field information to obtain a hash header field identification result. Because the hash table is generated from the protocol header field information contained in the system configuration file information, when an uncommon header field or a custom header field later changes, a new hash table can be generated simply by modifying the protocol header field information, which gives the hash table good flexibility and extensibility.
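The generation of the hash table from configuration data might look like the following sketch. The description does not specify the configuration file format, so the JSON layout, the key names `protocol_header_fields`, `uncommon`, and `custom`, and the sample field names are all assumptions made purely for illustration. A Python `dict` is used because it is itself a hash table.

```python
import json

# Hypothetical configuration content; the real file format is not specified.
CONFIG_TEXT = """
{
  "protocol_header_fields": {
    "uncommon": ["upgrade", "warning"],
    "custom": ["x-request-id"]
  }
}
"""


def build_hash_table(config_text):
    """Return a dict mapping each configured header field name to its kind."""
    fields = json.loads(config_text)["protocol_header_fields"]
    table = {}
    for kind in ("uncommon", "custom"):
        for name in fields.get(kind, []):
            table[name.lower()] = kind
    return table


HASH_TABLE = build_hash_table(CONFIG_TEXT)
print(HASH_TABLE)
```

Under this layout, supporting a new custom header field only requires adding its name to the configuration and rebuilding the table at initialization; no code change is needed.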
Sub-step 2802, determining whether the hash header field information in the hash table matches the header field information.
Sub-step 2803, if the hash header field information matches the header field information, taking the hash header field identification success result as the header field identification result.
Sub-step 2804, if the hash header field information does not match the header field information, taking the hash header field identification failure result as the header field identification result.
Specifically, after the hash table is obtained, it can be determined whether the hash table contains hash header field information that matches the header field information. If it does, a hash header field identification success result is determined and used as the header field identification result; that is, header field identification has succeeded, and the matching hash header field information can be used as the header field identification information corresponding to the message information. If it does not, a hash header field identification failure result is determined and used as the header field identification result; that is, header field identification has failed. In this way, uncommon header field information and custom header field information are identified rapidly through the hash table. When a custom header field or an uncommon header field subsequently changes, the hash table can be changed directly so that the changed field is identified, which provides good flexibility and extensibility.
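Sub-steps 2802 to 2804 amount to a single membership test against the hash table. The sketch below assumes the table is a Python `set` of lowercase field names; the function name, the result labels, and the sample entries are hypothetical.

```python
HASH_SUCCESS = "hash header field identification success"
HASH_FAILURE = "hash header field identification failure"


def hash_identify(hash_table, field_key):
    """Return (result label, matched field name or None)."""
    # Membership in a set is a hash lookup, so the cost does not grow
    # with the number of configured header fields on average.
    if field_key in hash_table:
        return HASH_SUCCESS, field_key
    return HASH_FAILURE, None


# Hypothetical table of uncommon and custom header fields.
table = {"upgrade", "x-request-id"}

print(hash_identify(table, "upgrade"))
print(hash_identify(table, "x-unknown"))
```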
To sum up, the embodiment of the present application obtains message information to be identified, extracts header field information from the message information, determines the header field type and field length corresponding to the header field information, determines a target header field tree corresponding to the header field information based on the header field type, and searches for a header field tree node in the target header field tree based on the field length. Character string matching identification is then performed based on the header field tree node and the header field information to obtain a character string matching identification result. When that result is a matching identification failure result, hash identification is performed on the header field information, so that uncommon header field information and/or custom header field information contained in the message information is identified by hashing. This solves the problems caused in the prior art by identifying HTTP protocol header fields through simple character string matching alone, achieves rapid header field identification, and improves both the efficiency and the accuracy of protocol header field identification.
It should be noted that for simplicity of description, the method embodiments are described as a series of acts, but those skilled in the art should understand that the embodiments are not limited by the described order of acts, as some steps can be performed in other orders or simultaneously according to the embodiments.
As shown in fig. 5, an embodiment of the present application further provides an apparatus 500 for identifying a header field of a packet, including:
an obtaining module 510, configured to obtain message information to be identified;
an extracting module 520, configured to extract header field information from the message information;
a positioning processing module 530, configured to perform positioning processing on the header field information, and determine a header field tree node corresponding to the header field information;
and the matching identification processing module 540 is configured to perform matching identification processing on the header field information based on the header field tree node, so as to obtain a header identification result corresponding to the packet information.
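Purely as an illustrative sketch, the four modules listed above could be composed as methods of a single class. The class name `HeaderFieldIdentifier`, the method names, the string result labels, and the choice to pass the header field type in from the caller are all assumptions; the embodiment does not prescribe how the modules are implemented.

```python
class HeaderFieldIdentifier:
    """Hypothetical composition of modules 510 to 540."""

    def __init__(self, trees, hash_table):
        self.trees = trees            # field type -> {length -> [preset strings]}
        self.hash_table = hash_table  # uncommon and custom header field names

    def obtain(self, source):                  # obtaining module 510
        return bytes(source)

    def extract(self, message):                # extracting module 520
        head = message.split(b"\r\n\r\n", 1)[0]
        keys = []
        for line in head.split(b"\r\n")[1:]:   # skip the start line
            key, sep, _ = line.partition(b":")
            if sep:
                keys.append(key.strip().decode("ascii", "replace").lower())
        return keys

    def locate(self, key, field_type):         # positioning processing module 530
        return self.trees.get(field_type, {}).get(len(key), [])

    def match(self, key, node):                # matching identification module 540
        if key in node:
            return "tree"
        if key in self.hash_table:             # fallback after a tree mismatch
            return "hash"
        return "failed"

    def run(self, source, field_type):
        message = self.obtain(source)
        return {k: self.match(k, self.locate(k, field_type))
                for k in self.extract(message)}


ident = HeaderFieldIdentifier({"request": {13: ["authorization"], 4: ["host"]}},
                              {"upgrade"})
raw = b"GET / HTTP/1.1\r\nHost: a\r\nAuthorization: x\r\nUpgrade: h2c\r\nX-Foo: 1\r\n\r\n"
print(ident.run(raw, "request"))
```

Keeping each module as a separate method mirrors the module boundaries of the apparatus, so that, for example, the positioning logic can be replaced without touching extraction.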
Optionally, the positioning processing module includes:
a header field type and field length determining submodule for determining the header field type and field length corresponding to the header field information;
a target header field tree determining submodule, configured to determine, based on the header category, a target header field tree corresponding to the header field information;
and the head field tree node searching submodule is used for searching the head field tree nodes from the target head field tree based on the field lengths.
Optionally, the matching identification processing module includes:
the character string matching and identifying submodule is used for carrying out character string matching and identifying on the basis of the head field tree nodes and the head field information to obtain a character string matching and identifying result;
a head domain identification success result determining sub-module, configured to, when the character string matching identification result is a matching identification success result, take the head domain identification success result as the head domain identification result;
and the hash identification submodule is used for carrying out hash identification on the header field information to obtain the header field identification result when the character string matching identification result is a matching identification failure result.
Optionally, the character string matching and identifying sub-module includes:
the character string matching unit is used for judging whether a preset character string in the head field tree node is matched with a character string in the head field information;
a matching identification success result determining unit, configured to generate a matching identification success result when the character string of the head field tree node matches the character string in the head field information;
and the matching identification failure result determining unit is used for generating a matching identification failure result when the character string of the head field tree node is not matched with the character string in the head field information.
Optionally, the hash identification sub-module includes:
a hash table obtaining unit, configured to obtain a preset hash table;
the hash table matching unit is used for judging whether hash head field information in the hash table is matched with the head field information;
a hash header field identification success result determining unit, configured to take a hash header field identification success result as the header field identification result when the hash header field information matches the header field information;
and the hash head domain identification failure result determining unit is used for taking the hash head domain identification failure result as the head domain identification result when the hash head domain field information is not matched with the head domain field information.
Optionally, the extracting module includes:
an end identifier extraction submodule for extracting an end identifier from the message information;
a header field line data determining submodule for determining header field line data based on the end identifier;
and the header field information extraction submodule is used for extracting header field information from the header field line data.
Optionally, the apparatus further includes the following units, which operate before the preset hash table is obtained:
a system configuration file information acquisition unit, configured to acquire system configuration file information;
a reading unit, configured to read protocol header field information from the system configuration file information, where the protocol header field information includes uncommon header field information and custom header field information;
and a hash table generating unit, configured to generate the hash table based on the uncommon header field information and the custom header field information.
It should be noted that the apparatus for identifying a message header field provided in the embodiments of the present application can execute the method for identifying a message header field provided in any embodiment of the present application, and has corresponding functions and beneficial effects of the execution method.
In a specific implementation, the apparatus may be integrated in a device, so that the device can perform header field identification on the collected message information to obtain a header field identification result; the device thus serves as a message header field identification device that implements identification of message header fields. The device may be composed of two or more physical entities or of a single physical entity; for example, it may be a personal computer (PC), a server, or the like, which is not specifically limited in this embodiment of the present application.
As shown in fig. 6, an embodiment of the present application provides an apparatus for identifying a message header field, including a processor 111, a communication interface 112, a memory 113, and a communication bus 114, where the processor 111, the communication interface 112, and the memory 113 complete communication with each other through the communication bus 114; a memory 113 for storing a computer program; the processor 111 is configured to implement the steps of the method for identifying a header field of a message provided in any one of the foregoing method embodiments when executing the program stored in the memory 113. Illustratively, the steps of the method for identifying a message header field may include the following steps: acquiring message information to be identified; extracting header field information from the message information; positioning processing is carried out aiming at the header field information, and header field tree nodes corresponding to the header field information are determined; and performing matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information.
The present application further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method for identifying a message header field as provided in any one of the foregoing method embodiments.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a … …" does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for identifying a message header field is characterized by comprising the following steps:
acquiring message information to be identified;
extracting header field information from the message information;
positioning processing is carried out on the header field information, and header field tree nodes corresponding to the header field information are determined;
and performing matching identification processing on the header field information based on the header field tree node to obtain a header identification result corresponding to the message information.
2. The method of claim 1, wherein the performing the positioning process on the header field information and determining the header field tree node corresponding to the header field information comprises:
determining a header field type and a field length corresponding to the header field information;
determining a target header field tree corresponding to the header field information based on the header field type;
and searching the head field tree node from the target head field tree based on the field length.
3. The method according to claim 2, wherein the performing matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the packet information includes:
performing character string matching identification based on the head field tree nodes and the head field information to obtain a character string matching identification result;
if the character string matching identification result is a matching identification success result, taking a head domain identification success result as the head domain identification result;
and if the character string matching identification result is a matching identification failure result, performing hash identification on the header field information to obtain the header identification result.
4. The method according to claim 3, wherein said performing string matching identification based on said head field tree node and said head field information to obtain a string matching identification result comprises:
judging whether a preset character string in the head field tree node is matched with a character string in the head field information;
if the character string of the head field tree node is matched with the character string in the head field information, generating a matching identification success result;
and if the character string of the head field tree node is not matched with the character string in the head field information, generating a matching identification failure result.
5. The method of claim 3, wherein the performing hash identification on the header field information to obtain the header identification result comprises:
acquiring a preset hash table;
judging whether hash head field information in the hash table is matched with the head field information;
if the hash head field information is matched with the head field information, taking a successful hash head field identification result as the head field identification result;
and if the hash head field information is not matched with the head field information, taking a hash head field identification failure result as the head field identification result.
6. The method of claim 1, wherein the extracting header field information to be identified from the packet information comprises:
extracting an end identifier from the message information;
determining header field row data based on the end identifier;
and extracting header field information from the header field line data.
7. The method according to claim 5, wherein before the obtaining the preset hash table, the method further comprises:
acquiring system configuration file information;
reading protocol header field information from the system configuration file information, wherein the protocol header field information comprises uncommon header field information and custom header field information;
and generating the hash table based on the uncommon header field information and the custom header field information.
8. An apparatus for identifying a header field of a packet, comprising:
the acquisition module is used for acquiring message information to be identified;
an extraction module, configured to extract header field information from the message information;
a positioning processing module, configured to perform positioning processing on the header field information, and determine a header field tree node corresponding to the header field information;
and the matching identification processing module is used for carrying out matching identification processing on the header field information based on the header field tree node to obtain a header field identification result corresponding to the message information.
9. The device for identifying the message header field is characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the steps of the method for identifying a header field of a message according to any one of claims 1 to 7 when executing a program stored in a memory.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for identifying a header field of a message according to any one of claims 1 to 7.
CN202210412422.4A 2022-04-19 2022-04-19 Message header domain identification method, device, equipment and storage medium Active CN115379026B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210412422.4A CN115379026B (en) 2022-04-19 2022-04-19 Message header domain identification method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210412422.4A CN115379026B (en) 2022-04-19 2022-04-19 Message header domain identification method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115379026A true CN115379026A (en) 2022-11-22
CN115379026B CN115379026B (en) 2024-01-19

Family

ID=84060671

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210412422.4A Active CN115379026B (en) 2022-04-19 2022-04-19 Message header domain identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115379026B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101743697A (en) * 2008-01-04 2010-06-16 中央大学校产学协力团 Label identification method, label anti-confliction method and RFID tag
CN102833327A (en) * 2012-08-16 2012-12-19 瑞斯康达科技发展股份有限公司 Method and device for recognizing type of client based on HTTP (hypertext transport protocol)
CN103401777A (en) * 2013-08-21 2013-11-20 中国人民解放军国防科学技术大学 Parallel search method and system of Openflow
CN104320304A (en) * 2014-11-04 2015-01-28 武汉虹信技术服务有限责任公司 Multimode integration core network user traffic application identification method easy to expand
CN105099918A (en) * 2014-05-13 2015-11-25 华为技术有限公司 Method and apparatus for data searching and matching
CN106713182A (en) * 2015-08-10 2017-05-24 华为技术有限公司 Method and device for processing flow table
CN109729223A (en) * 2017-10-30 2019-05-07 中国电信股份有限公司 Call service processing method, device, number changing service platform and storage medium
CN109936624A (en) * 2019-01-31 2019-06-25 平安科技(深圳)有限公司 Adaptation method, device and the computer equipment of HTTP request heading
US20200235981A1 (en) * 2019-01-18 2020-07-23 Hewlett Packard Enterprise Development Lp Using a Recursive Parser Tree to Implement a Smaller Code Segment for an Embedded Simple Network Management Protocol Agent
CN111740946A (en) * 2020-05-09 2020-10-02 郑州启明星辰信息安全技术有限公司 Webshell message detection method and device
US20210092153A1 (en) * 2018-02-05 2021-03-25 Chongqing University Of Posts And Telecommunications Ddos attack detection and mitigation method for industrial sdn network
CN114070761A (en) * 2021-11-11 2022-02-18 北京轨道交通路网管理有限公司 Protocol message detection method, device and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
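孙琼 (Sun Qiong): "Research on packet identification and lookup techniques for the next-generation Internet", China Excellent Doctoral Dissertations *
陈曙晖 (Chen Shuhui): "Research on high-speed network protocol identification technology based on content analysis", China Excellent Doctoral Dissertations Full-text Database, Information Science and Technology series *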

Also Published As

Publication number Publication date
CN115379026B (en) 2024-01-19

Similar Documents

Publication Publication Date Title
CN102752288B (en) Method and device for identifying network access action
US20150095359A1 (en) Volume Reducing Classifier
CN101711470A (en) A system and method for creating a list of shared information on a peer-to-peer network
US11888874B2 (en) Label guided unsupervised learning based network-level application signature generation
WO2013159512A1 (en) User behavior analysis method, and related equipment and system
US20150120692A1 (en) Method, device, and system for acquiring user behavior
CN110768875A (en) Application identification method and system based on DNS learning
CN110674427B (en) Method, device, equipment and storage medium for responding to webpage access request
CN103236940A (en) Method and device for content processing and network equipment
CN111209325A (en) Service system interface identification method, device and storage medium
CN114330280A (en) Sensitive data identification method and device
CN105184559B (en) A kind of payment system and method
CN115379026B (en) Message header domain identification method, device, equipment and storage medium
CN111314109A (en) Weak key-based large-scale Internet of things equipment firmware identification method
CN105099996B (en) Website verification method and device
CN115865457A (en) Network attack behavior identification method, server and medium
CN113992364B (en) Network data packet blocking optimization method and system
CN111200666A (en) Method and system for identifying access domain name
CN114697271A (en) Method and device for determining data flow label and related equipment
CN111290804A (en) Service configuration system, service configuration method, device and configuration server
CN112532414A (en) Method, device and equipment for determining ISP attribution and computer storage medium
CN105743992B (en) Information processing method and device
CN106447370B (en) Advertisement material data website verification method and device
CN112866140B (en) Service matching method, gateway management platform, gateway equipment and server
CN114070819B (en) Malicious domain name detection method, device, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant