CN109688043B - IMAP protocol multi-link association analysis method and system - Google Patents


Publication number
CN109688043B
Authority
CN
China
Prior art keywords
mime
block
current
path
header
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710975868.7A
Other languages
Chinese (zh)
Other versions
CN109688043A (en
Inventor
张成伟
刘庆云
刘洋
杨威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS filed Critical Institute of Information Engineering of CAS
Priority to CN201710975868.7A priority Critical patent/CN109688043B/en
Publication of CN109688043A publication Critical patent/CN109688043A/en
Application granted granted Critical
Publication of CN109688043B publication Critical patent/CN109688043B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/42Mailbox-related aspects, e.g. synchronisation of mailboxes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00Arrangements for monitoring or testing data switching networks
    • H04L43/18Protocol analysers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/06Message adaptation to terminal or network requirements
    • H04L51/066Format adaptation, e.g. format conversion or compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses an IMAP protocol multilink association analysis method and system. The method comprises the following steps: 1) acquiring, from a TCP link, the data structure BODYSTRUCTURE that describes the header information of all MIME blocks of the whole mail; 2) extracting the MIME block headers from the data structure BODYSTRUCTURE, and labeling each MIME block header according to its position; 3) constructing a unique key KEYx for each MIME block header and adding the MIME block header information to a hash table; after the TCP link transmitting a MIME entity captures the traffic of a MIME block entity, extracting key information from the traffic, generating a key consistent with KEYx, and acquiring the corresponding MIME header information from the hash table according to the key KEYx. The method provides more comprehensive monitoring capability and greatly improves the efficiency of hash association.

Description

IMAP protocol multi-link association analysis method and system
Technical Field
The invention belongs to the technical field of computer networks, and relates to an IMAP protocol multilink association analysis method and system.
Background
Enterprises and organizations with high security requirements have a strong need to monitor content transmitted over the Internet. Traffic entering and leaving the Internet is monitored and audited to detect attacks and data leakage. Email is one of the primary means by which businesses and organizations exchange files internally and externally. As the Internet security situation deteriorates and awareness of confidentiality grows, more and more network monitoring systems treat email as a monitored object and are deployed at the Internet entry and exit points of operators, governments, the military, enterprises and the like. Generally, a network monitoring system obtains the traffic between an intranet and the Internet by optical splitting or port mirroring. By reassembling and parsing the network traffic, the system can detect, analyze and review email content and record email transmissions.
The IMAP4 (Internet Message Access Protocol, Version 4) protocol is one of the major email protocols. One of its main differences from the POP3 protocol is that it supports downloading part of the mail content in blocks and segments. In addition, the IMAP4 protocol abstracts the headers of the MIME (Multipurpose Internet Mail Extensions) protocol into a data structure (BODY/BODYSTRUCTURE, hereinafter referred to as BODYSTRUCTURE) that describes the header information of all MIME blocks of an entire mail and is usually transmitted first when the mail is downloaded. A MIME block consists of a MIME block header and a MIME block entity. The MIME block header describes the entity of the MIME block. MIME blocks support multi-level nesting, i.e., one entire MIME block can be the entity of another MIME block. Consequently, for email transmission based on the IMAP4 protocol, the MIME block entity and the BODYSTRUCTURE may be transmitted in different TCP links. This is especially common in the email applications of mobile smart terminals. Existing solutions either parse the protocol only for a whole mail transmitted in one TCP link, or do not consider transmission over multiple TCP links. Patent application No. CN106385358A describes a method for verifying and collecting evidence from email data packets, but it mentions parsing BODYSTRUCTURE only as one option, does not describe the parsing method, does not support data acquisition during block downloading, and does not mention multi-link transmission.
In summary, to monitor email based on the IMAP4 protocol more comprehensively, it is necessary to design a protocol parsing system that supports the IMAP4 BODYSTRUCTURE mail transmission mode and association across multiple TCP links.
Disclosure of Invention
Aiming at the technical problems in the prior art, the invention aims to provide an IMAP protocol multilink association analysis method and system. The invention can obtain the BODYSTRUCTURE structure from the network flow and extract the MIME block header information; MIME block entity information can be obtained and MIME block header information can be associated with the MIME block entity information to complete parsing of the IMAP4 protocol.
The main content of the invention comprises:
1. An identical, unique key is constructed for a MIME block header extracted from any TCP link and for the corresponding captured MIME block entity, and a hash algorithm is used to associate the two, so that the MIME block entity is parsed according to the MIME block header information.
2. According to the structural form of BODYSTRUCTURE, the beginning, the end, the nesting depth and the parallel layer number of the MIME block header are identified, a path level label derivation method is designed, and a path level label is generated for each MIME block header.
3. According to the structural form of the BODYSTRUCTURE, a MIME block header separation and MIME block header field extraction two-stage nested state machine is constructed.
Compared with the prior art, the invention has the following positive effects:
1. Data acquisition during segmented and block downloading over the IMAP4 protocol is supported: the header of the MIME block corresponding to a downloaded MIME block entity can be obtained through hash association to complete the parsing of the MIME block, which gives more comprehensive monitoring capability.
2. The fact that IMAP4 labels MIME blocks during block downloading is fully exploited: each MIME block header extracted from the BODYSTRUCTURE is given a path-level label, namely the MIME block label, which greatly improves the efficiency of hash association.
Drawings
Fig. 1 is a diagram of IMAP protocol multilink association;
Fig. 2 is a state flow diagram of field extraction for a MIME non-MULTIPART block.
Detailed Description
In order to make the aforementioned and other features and advantages of the invention more comprehensible, embodiments accompanied with figures are described in detail below.
To monitor IMAP email transmission more comprehensively and effectively at the Internet access points of enterprises or organizations, the invention proposes an association method for multi-link IMAP email transmission. When an IMAP email is downloaded in blocks and segments, the method associates the MIME block headers, taken from the structure describing all MIME block information of the whole mail, with the MIME block entities, and completes the parsing of each MIME block entity according to the description given by its MIME block header.
In multi-link transmission, the BODYSTRUCTURE request may be in one TCP link while the body or attachment content is in another TCP link. The processing program at the intermediate node must capture the data of the link that carries the BODYSTRUCTURE before it can apply it to parsing the entity content data in the other link. IMAP4 protocol parsing is based on TCP link management; for the cross-link case, the protocol parser maintains a global information table that associates the multiple TCP links of one mail.
The frame diagram of the whole system is shown in fig. 1, and is mainly divided into three sub-modules:
(1) BODYSTRUCTURE parsing state machine. Because the MIME header fields are variable, the attributes and values of the fields are not fixed, and some fields are optional, BODYSTRUCTURE strings vary widely and have no fixed string format, so complex nested MIME header extraction is difficult to achieve through ordinary string parsing. A state machine is therefore used for parsing.
(2) Path label generation. When a MIME block entity is downloaded, the IMAP4 protocol specifies the requested block by its block label (the IMAP4 protocol labels the MIME blocks), for example BODY[1.2]. When a MIME block header is extracted from the BODYSTRUCTURE, the invention automatically labels it according to its position and nesting (that is, according to the order of appearance and the nesting of the MIME block header within the BODYSTRUCTURE) and ensures that the label is consistent with the label of the entity corresponding to the MIME block.
(3) Hash association. After the MIME blocks of a mail have been extracted from the BODYSTRUCTURE, a unique key KEYx is constructed for each MIME block header, and the MIME block header information is added to a hash table. After traffic carrying a MIME block entity is captured, key information is extracted from the traffic to generate a key consistent with KEYx, and the MIME header information stored in the hash table is thereby obtained.
After the above steps are completed, the global IMAP BODYSTRUCTURE information table stores the MIME block header information of all IMAP4 mails flowing through the Internet gateway, and any TCP link transmitting a MIME entity can find the corresponding MIME block header by looking up the hash table, thereby completing protocol parsing.
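The "key information" on the entity side includes the block label carried in the server response. The following is a minimal sketch of how a monitor might pull the UID and the block label out of one FETCH response line. The function name `parse_fetch_response` and the two regular expressions are illustrative assumptions, not part of the patent; real traffic may order the items differently or use forms such as RFC822.TEXT that this sketch does not handle.

```python
import re

# Assumed shapes of items in a server FETCH response line (illustrative only).
_UID_RE = re.compile(r"\bUID (\d+)")
_SECTION_RE = re.compile(r"\bBODY\[([0-9.]+|TEXT)?\]")


def parse_fetch_response(line):
    """Return (uid, path_label) for a FETCH response line, or None if either is absent."""
    uid = _UID_RE.search(line)
    section = _SECTION_RE.search(line)
    if uid is None or section is None:
        return None
    label = section.group(1)
    # Following the text, whole-mail and TEXT requests map to the top-level label "0".
    if label is None or label == "TEXT":
        label = "0"
    return uid.group(1), label
```

For a line such as `* 12 FETCH (UID 4827 BODY[1.2] {1024}` the sketch yields the UID and the label `1.2`, which can then be combined with the user name to look up the stored header.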
The processing procedure of each step is explained in detail below.
(1) BODYSTRUCTURE parsing state machine. It is a two-stage state machine consisting of a MIME block separation state machine and a MIME block header field extraction state machine.
MIME blocks support nesting (see RFC 2045 to RFC 2049), so the MIME headers in the BODYSTRUCTURE can also be nested. The IMAP protocol supports requesting an arbitrary MIME block label, so the header of each MIME block must first be separated from the BODYSTRUCTURE, and the information of each field must then be extracted from the separated MIME block header. The invention designs a state machine for each of these two steps, namely the MIME block separation state machine and the MIME block header field extraction state machine, which together form a two-stage state machine. The input data of the state machine is the BODYSTRUCTURE response data, that is, the pair of parentheses "()" that follows the character string "BODYSTRUCTURE" and the data inside it (not including the "#" characters, the labels that follow them, or the line breaks shown in the example). An example of MIME block separation and path labels in a BODYSTRUCTURE is as follows:
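[Figure BDA0001438447170000041: example BODYSTRUCTURE response showing MIME block separation and "#" path labels; image not reproduced]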
Description 1: the "#" characters and the labels following them in the example are not part of the original packet content; they are the path-level labels derived by the invention for each MIME block, shown to explain the label derivation method clearly.
Description 2: the label 0 is the top-level MIME block label and represents retrieval of the entire mail.
a) MIME block separation state machine steps
Step 1101: initialize the context, set the previous-state variable prestate of the MIME block separation state machine to the IDLE state, execute step 201, execute step 1102, and jump to step 1103;
Step 1102: skip spaces and tab characters in the input data; if the input is exhausted, end the loop of the MIME block separation state machine, otherwise return to the calling step;
Step 1103: (IDLE state) record the position of the current pointer in pstart and add 1 to the current pointer; if the next character is not the left parenthesis "(", set the single-block flag variable single;
Step 1104: (START state) set prestate to the START state; if the current character is not the left parenthesis "(", jump to step 1105; otherwise, if prestate is START, perform the operation of step 202, and if not, perform the operation of step 203; set the MIME block header start pointer variable pstart to point to the current position, add 1 to the current pointer, execute step 1102, and continue with step 1104;
Step 1105: (PROC state) set prestate to the PROC state; starting from the current position, search forward for the character ")" until a complete MIME block header has been obtained (i.e., a closed pair of parentheses is formed with the "(" found in step 1104 or step 1106); execute step 1102 and update the current pointer to point to the next character; perform the operation of step 204 and jump to step 1106;
Step 1106: (END state) set prestate to the END state; if the current character is "(", perform the operation of step 205, set pstart to point to the current position, and jump to step 1104; otherwise the current block is a MULTIPART block: set pstart to point to the current position and, starting from the current position, search forward for the character ")" until a complete MIME block header has been obtained (i.e., a closed pair of parentheses is formed with the "(" found in step 1104 or step 1106), then perform the operation of step 206 and add 1 to the current pointer; if the input is not exhausted, jump to step 1104, otherwise end the MIME block separation state machine.
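The outcome of block separation, one labeled header per MIME block, can be illustrated with a much simpler technique. The sketch below is not the state machine described in the steps above: it swaps in a recursive-descent parser that turns the parenthesized BODYSTRUCTURE into nested lists and then walks them to assign labels. The names `parse_parenthesized` and `label_blocks` are hypothetical, the input is assumed to be well formed, and IMAP literals (`{n}`) and nested message/rfc822 numbering are not handled. The sample string is made up, modeled on the values quoted in the text (BASE64, GBK, ghash.h).

```python
def parse_parenthesized(text):
    """Parse one IMAP parenthesized list into nested Python lists (NIL becomes None)."""
    pos = 0

    def parse_list():
        nonlocal pos
        items = []
        pos += 1  # skip the opening "("
        while True:
            while text[pos] in " \t\r\n":
                pos += 1
            ch = text[pos]
            if ch == ")":
                pos += 1
                return items
            if ch == "(":
                items.append(parse_list())
            elif ch == '"':
                end = pos + 1
                buf = []
                while text[end] != '"':
                    if text[end] == "\\":  # quoted-string escape
                        end += 1
                    buf.append(text[end])
                    end += 1
                items.append("".join(buf))
                pos = end + 1
            else:
                end = pos
                while text[end] not in " \t\r\n()":
                    end += 1
                atom = text[pos:end]
                items.append(None if atom.upper() == "NIL" else atom)
                pos = end

    return parse_list()


def label_blocks(node, path="0", out=None):
    """Map path labels to block headers; "0" is the top-level label as in the text."""
    if out is None:
        out = {}
    out[path] = node
    if node and isinstance(node[0], list):
        # MULTIPART block: the leading list elements are its child blocks.
        index = 0
        for child in node:
            if not isinstance(child, list):
                break
            index += 1
            child_path = str(index) if path == "0" else f"{path}.{index}"
            label_blocks(child, child_path, out)
    elif path == "0":
        # Single-block mail: the same header is also stored under label "1".
        out["1"] = node
    return out


SAMPLE = ('((("TEXT" "PLAIN" ("CHARSET" "GBK") NIL NIL "BASE64" 120 3)'
          '("TEXT" "HTML" ("CHARSET" "GBK") NIL NIL "BASE64" 300 8) "ALTERNATIVE")'
          '("APPLICATION" "OCTET-STREAM" ("NAME" "ghash.h") NIL NIL "BASE64" 2048) "MIXED")')
labels = label_blocks(parse_parenthesized(SAMPLE))
```

A single pass state machine as in the steps above avoids building the whole tree in memory; the recursive form is used here only because it makes the label derivation easy to read.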
b) MIME block header field extraction State machine step
The IMAP protocol specifies the order of the fields of a MIME block header in the BODYSTRUCTURE. For example, for a non-MULTIPART block, the basic fields are defined in the order: body type, body subtype, body parameter list, body ID, body description, body transfer encoding, body size; the extension fields are defined in the order: envelope (optional), BODYSTRUCTURE (optional), line count (optional), body MD5, body disposition, body language, body location. Some fields may be empty, and some fields are divided into several sub-units; for example, the body disposition may contain the attachment name. The extension fields are optional. The MIME field extraction state machine therefore extracts the fields one by one in the order of the MIME header, as shown in fig. 2.
The MIME block header field extraction state machine mainly comprises the following steps:
Step 1201: determine whether the current MIME block is a MULTIPART block; if so, execute step 1203, otherwise (a non-MULTIPART block) execute step 1202;
Step 1202: extract the fields one by one in the order of the MIME block header fields, determining from the context (as specified in RFC 3501) whether each optional field is present and whether to move to the next field state, as shown in fig. 2. For the body parameter list, the values of the attributes CHARSET and NAME are extracted; for the body transfer encoding, the encoding type is extracted; for the body disposition, the value of FILENAME is extracted. After extraction is complete, step 301 is executed to add the MIME block header information to the hash table;
Step 1203: extract the fields one by one in the order of the MIME header fields, determining from the context (RFC 3501) whether each optional field is present and moving to the next field state accordingly. For the body parameter list, the value of the BOUNDARY attribute is extracted. After extraction is complete, step 301 is executed to add the MIME block header information to the hash table.
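As a rough illustration of steps 1201 to 1203, the sketch below extracts the fields named in the text from a block header that has already been split into a nested Python list. It is positional slicing rather than the field-by-field state machine of fig. 2, and the function names, the returned dictionary layout, and the assumption that sizes arrive as strings are all illustrative choices, not part of the patent.

```python
def pairs_to_dict(plist):
    """Turn a flat ["KEY", "value", ...] list (or None) into a dict with upper-case keys."""
    if not plist:
        return {}
    return {plist[i].upper(): plist[i + 1] for i in range(0, len(plist) - 1, 2)}


def extract_header_fields(block):
    """Step 1201: dispatch on block kind, then pull out the fields the text names."""
    if isinstance(block[0], list):
        # Step 1203 (MULTIPART): child blocks first, then subtype and parameter list.
        n_children = 0
        while n_children < len(block) and isinstance(block[n_children], list):
            n_children += 1
        rest = block[n_children:]
        params = pairs_to_dict(rest[1] if len(rest) > 1 else None)
        return {"multipart": True, "subtype": rest[0],
                "boundary": params.get("BOUNDARY")}

    # Step 1202 (non-MULTIPART): seven basic fields in fixed order.
    main_type, sub_type, params, _body_id, _desc, encoding, size = block[:7]
    rest = block[7:]
    # Context decides which optional fields precede MD5 and disposition.
    if main_type.upper() == "MESSAGE" and sub_type.upper() == "RFC822":
        rest = rest[3:]  # envelope, BODYSTRUCTURE, line count
    elif main_type.upper() == "TEXT":
        rest = rest[1:]  # line count
    disposition = rest[1] if len(rest) > 1 else None
    disp_params = pairs_to_dict(disposition[1]) if disposition and len(disposition) > 1 else {}
    params = pairs_to_dict(params)
    return {"multipart": False,
            "type": main_type.upper(), "subtype": sub_type.upper(),
            "charset": params.get("CHARSET"), "name": params.get("NAME"),
            "encoding": encoding.upper() if encoding else None,
            "size": int(size),
            "filename": disp_params.get("FILENAME")}
```

The returned dictionary is the kind of value that step 301 would store in the hash table.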
(2) Path label generation.
The invention generates a path-level label for each MIME block header in the BODYSTRUCTURE; this label is consistent with the label of the corresponding MIME block entity in the IMAP4 protocol response. The labels following the annotation character "#" in the above example are the path labels of the blocks. In particular, the label "0" is defined as the top-level label, i.e., the label of the mail body: when the requested content is RFC822.TEXT or BODY[TEXT], the path label of the block is "0".
Label generation is interleaved with the operation of the MIME block separation state machine (rather than performed sequentially); whichever of the following steps is needed is executed at each point.
Step 201: initialize the current path label path to "0";
Step 202: push path onto the stack, then append the character string ".1" (a dot followed by the digit 1) to the path string;
Step 203: if path is "0", the current MIME header is at the top level, and path is pushed onto the stack; then add 1 to the number after the last dot of path, or, if there is no dot, add 1 to the number directly;
Step 204: extract the header of the current MIME block, whose start position is pstart and whose end position is the current position (excluding the character at the current pointer), with path as the path label of the current block; extract the header field information of the current MIME block, i.e., execute step 1201. In particular, if the single variable was set in step 1103, step 1201 is executed once more on the same MIME block data (all header fields are extracted again), except that the input path label is not path but the character string "1", which indicates that the mail has only a body;
Step 205: add 1 to the number after the last "." of path, or, if there is no ".", add 1 to the number directly;
Step 206: pop the top element of the stack and store it in path. The current MIME block is a MULTIPART block; its header information string starts at pstart and ends at the current position (not included), and its path label is path. Start extracting the MIME header field information of the current block, i.e., execute step 1201.
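The stack operations in steps 201 to 206 reduce to three primitives: go one level deeper, move to the next sibling, and return to the enclosing block. The sketch below is a simplified reading of those steps, with a hypothetical class name and method names; it does not reproduce the exact conditions under which the state machine triggers each operation.

```python
class PathLabel:
    """Simplified path-label bookkeeping with an explicit stack."""

    def __init__(self):
        self.path = "0"   # step 201: start at the top-level label
        self.stack = []

    def descend(self):
        """Enter the first child: push the current label, then extend it."""
        self.stack.append(self.path)
        self.path = "1" if self.path == "0" else self.path + ".1"

    def next_sibling(self):
        """Add 1 to the number after the last dot (or to the whole number)."""
        head, dot, last = self.path.rpartition(".")
        self.path = f"{head}{dot}{int(last) + 1}"

    def ascend(self):
        """Leave a MULTIPART block: restore the label saved on the stack."""
        self.path = self.stack.pop()
```

For a mail whose first top-level block is itself a MULTIPART block with two children, followed by one attachment, the calls `descend`, `descend`, `next_sibling`, `ascend`, `next_sibling`, `ascend` visit the labels 1, 1.1, 1.2, 1, 2 and finally return to 0.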
(3) Hash association.
Hash table association consists of two parts: adding elements to the hash table after the BODYSTRUCTURE has been parsed (step 301), and looking up the hash table when IMAP email content downloaded in blocks and segments is captured (step 302). These two steps are not performed sequentially but on separate occasions.
The hash table key is generated by combining the IMAP email user name, the mail unique identifier UID and the MIME block path label path in a fixed order and format; the combined result is the key. The same key is used for adding to the hash table and for looking it up.
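A minimal sketch of this key scheme follows, using a Python dict as the hash table. The text fixes only the three components and their order; the "|" separator, the function names and the module-level `header_table` are assumptions made for illustration.

```python
# Global table shared by all TCP links of the monitor (illustrative).
header_table = {}


def make_key(user, uid, path):
    """Combine user name, UID and path label in a fixed order and format."""
    return f"{user}|{uid}|{path}"  # the separator is an assumption


def add_header(user, uid, path, header):
    """Called on the link that carried the BODYSTRUCTURE (adding to the table)."""
    header_table[make_key(user, uid, path)] = header


def find_header(user, uid, path):
    """Called on a link that carries a MIME block entity (table lookup)."""
    return header_table.get(make_key(user, uid, path))
```

Because both sides build the key from the same three values, a link that only ever sees `BODY[1.1]` data can still recover the header recorded earlier on a different link.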
Step 301: generate the key, create a hash table element that stores the MIME block header information (i.e., the value) extracted in step 1202 or step 1203, and add the element to the hash table;
Step 302: generate the key, look up the hash table and obtain the MIME block header information. If the current block is a MULTIPART block, the value of the BOUNDARY attribute stored in the hash table is extracted and MIME decoding is performed; in the example, if the currently requested content is BODY[1], the corresponding block is "#1". If the current block is not a MULTIPART block, the entity is decoded according to the encoding type, and the character set is determined from the CHARSET attribute; in the example, if the requested content is BODY[1.1], the corresponding block is "#1.1", and the transfer encoding of the current block (i.e., the MIME block corresponding to BODY[1.1]) is found to be "BASE64" and its character set "GBK". If the current MIME block is an attachment, the attachment name is associated; for example, for block #2 in the example, the corresponding requested content is BODY[2], and after BASE64 decoding of the FILENAME attribute, the attachment name of the current block is "ghash.h" and the character set of the attachment name is "GBK".
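Once the header has been found, decoding a non-MULTIPART entity is a matter of applying the recorded transfer encoding and character set. The sketch below assumes the header is a dictionary with `encoding` and `charset` entries (an illustrative layout, not defined by the patent) and uses only Python standard-library decoders; MULTIPART splitting by BOUNDARY and attachment-name decoding are left out.

```python
import base64
import quopri


def decode_entity(raw, header):
    """Decode captured entity bytes using the associated MIME block header."""
    encoding = (header.get("encoding") or "7BIT").upper()
    if encoding == "BASE64":
        data = base64.b64decode(raw)
    elif encoding == "QUOTED-PRINTABLE":
        data = quopri.decodestring(raw)
    else:
        data = raw  # 7BIT, 8BIT and BINARY need no transfer decoding
    charset = header.get("charset")
    if charset:
        # Text block: turn bytes into a string with the recorded character set.
        return data.decode(charset, errors="replace")
    return data  # e.g. an attachment: keep the raw bytes
```

A block recorded as BASE64 with character set GBK is thus returned as text, while a block without a character set, such as a binary attachment, is returned as bytes.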
The above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and a person skilled in the art can make modifications or equivalent substitutions to the technical solution of the present invention without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.

Claims (10)

1. An IMAP protocol multilink association analysis method comprises the following steps:
1) acquiring a data structure BODYSTRUCTURE of header information of all MIME blocks describing the whole mail from a TCP link;
2) extracting the MIME block header from the data structure BODYSTRUCTURE, and labeling the MIME block header according to the position of the MIME block header;
3) constructing a unique key KEYx for each MIME block header and adding the MIME block header information to a hash table; after the TCP link transmitting a MIME entity captures the traffic of a MIME block entity, extracting key information from the traffic, generating a key consistent with KEYx, and acquiring the corresponding MIME header information from the hash table according to the key KEYx.
2. The method of claim 1, wherein the method for extracting the MIME block header from the data structure BODYSTRUCTURE and labeling the MIME block header according to its position is: setting a two-stage state machine comprising a MIME block separation state machine and a MIME block header field extraction state machine;
a) the steps of the MIME block separation state machine comprise:
step 1101: setting the last state variable prestate of the MIME block separation state machine as an IDLE state, and initializing the current path label path as a top-level label "0"; then, step 1102 and step 1103 are executed;
step 1102: skipping the blank space and the tab character of the input data, and ending the circulation of the state machine if the input is finished;
step 1103: recording the position pstart of the current pointer, and adding 1 to the current pointer; if the next character is not the parenthesis "(", then a single block mark variable single is set;
step 1104: setting prestate to the START state; if the current character is not the left parenthesis "(", jumping to step 1105; if the current character is the left parenthesis "(" and prestate is the START state, pushing the path label path onto the stack and appending the character string ".1" to path; otherwise, if the current character is the left parenthesis "(", adding 1 to the number after the last dot "." of path, or, if path contains no dot, adding 1 to the number directly, and then, if the current path label path is "1", pushing the path label "0" onto the stack; then setting the MIME block header start pointer variable pstart to point to the current position, adding 1 to the current pointer, executing step 1102, and then executing step 1104 again;
step 1105: setting the state variable prestate to the PROC state; starting from the current position, searching forward for the character ")" until a complete MIME block header is obtained; executing step 1102 and updating the current pointer to point to the next character; then extracting the data block between pstart and the current position, namely the header field information of the MIME block, taking path as the current path label, and extracting the MIME header field information of the current block, namely executing step 1201; if the single variable is set, taking "1" as the current path label and executing step 1201 again; jumping to step 1106;
step 1106: setting prestate to the END state; if the current character is "(", adding 1 to the number after the last dot "." of path, or, if path contains no dot, adding 1 to the number directly, setting pstart to point to the current position, and jumping to step 1104; otherwise, the current block is a MULTIPART block: setting pstart to point to the current position, searching forward from the current position for the character ")" until a complete MIME block header is obtained, then performing a pop operation, popping the top element of the stack and storing it in the path label path, and starting to extract the MIME header field information of the current block, namely executing step 1201; adding 1 to the current pointer; if the input is not finished, jumping to step 1104, otherwise ending the MIME block separation state machine;
b) the MIME block header field extraction state machine comprises the following steps:
step 1201: judging whether the current MIME block is a MULTIPART block, if so, executing a step 1203, otherwise, executing a step 1202;
step 1202: extracting the fields one by one in the order of the MIME block header fields, and determining from the context whether each optional field is present and whether to jump to the next field state; wherein, for the body parameter list, the values of the attributes CHARSET and NAME are extracted; for the body transfer encoding, the encoding type is extracted; for the body disposition, the value of FILENAME is extracted; after extraction, adding the MIME block header information to the hash table;
step 1203: extracting one by one according to the sequence of the MIME head fields, and judging whether the optional fields exist and determining whether to jump to the next field state or not according to the context; for the parameter combination list, the value of the BOUNDARY attribute needs to be extracted; and after extraction is finished, adding the MIME block header information into the hash table.
3. The method of claim 2, wherein if the current MIME block is a MULTIPART block, the information of the attribute BOUNDARY in the hash table is extracted for MIME decoding; if the current MIME block is not a MULTIPART block, entity decoding is carried out according to the encoding type; if the current MIME block is an attachment, the attachment name is associated.
4. The method of claim 1 or 2 or 3, wherein the method of generating the hash table key KEYx for the MIME block header is: and combining the IMAP e-mail user name, the mail unique identifier UID and the MIME block path label path corresponding to the MIME block header according to the sequence and the set format, wherein the combination result is a hash table key KEYx of the MIME block header.
5. The method of claim 1, wherein a path-level label is generated for each MIME block header in the data structure BODYSTRUCTURE, the path-level label being consistent with the label of the corresponding MIME block entity in the IMAP4 protocol response.
6. An IMAP protocol multilink association analysis system, characterized by comprising a data structure BODYSTRUCTURE extraction module and a BODYSTRUCTURE parsing state machine; wherein
the data structure BODYSTRUCTURE extraction module is used for acquiring the data structure BODYSTRUCTURE of the header information of all MIME blocks describing the whole mail from the TCP link;
the BODYSTRUCTURE parsing state machine is used for extracting the MIME block headers from the data structure BODYSTRUCTURE and labeling each MIME block header according to its position; constructing a unique key KEYx for each MIME block header and adding the MIME block header information to a hash table; after the TCP link transmitting a MIME entity captures the traffic of a MIME block entity, extracting key information from the traffic, generating a key consistent with KEYx, and acquiring the corresponding MIME header information from the hash table according to the key KEYx.
7. The system of claim 6, wherein the BODYSTRUCTURE parsing state machine is a two-stage state machine comprising a MIME block separation state machine and a MIME block header field extraction state machine; wherein
the steps of the MIME block separation state machine comprise:
step 1101: setting the previous-state variable prestate of the MIME block separation state machine to the IDLE state, and initializing the current path label path to the top-level label "0"; then executing step 1102 and step 1103;
step 1102: skipping spaces and tab characters in the input data, and ending the state machine loop if the input is exhausted;
step 1103: recording the position of the current pointer in pstart, and advancing the current pointer by 1; if the next character is not the left parenthesis "(", setting the single-block flag variable single;
step 1104: setting prestate to the START state; if the current character is not the left parenthesis "(", jumping to step 1105; if the current character is the left parenthesis "(" and prestate is the START state, pushing the path label path onto the stack and appending the string ".1" to the path label path; otherwise, if the current character is the left parenthesis "(", adding 1 to the number after the last dot "." of the path label path, or, if the label contains no dot ".", adding 1 to the number directly; then, if the current path label path is "1", pushing the path label "0" onto the stack; then setting the MIME block header start pointer variable pstart to the current position, advancing the current pointer by 1, and executing step 1102 followed by step 1104;
step 1105: setting the state variable prestate to the PROC state; starting from the current position, scanning forward for the character ")" until a complete MIME block header is obtained, executing step 1102, and updating the current pointer to point to the next character; then extracting the data between pstart and the current position, namely the header field information of the MIME block, taking path as the current path label, and extracting the MIME header field information of the current block, namely executing step 1201; if the single variable is set, taking "1" as the current path label and executing step 1201 again; then jumping to step 1106;
step 1106: setting prestate to the END state; if the current character is "(", adding 1 to the number after the last dot "." of path, or, if path contains no dot ".", adding 1 to the number directly, setting pstart to the current position, and jumping to step 1104; otherwise, the current block is a MULTIPART block: setting pstart to the current position, scanning forward from the current position for the character ")" until a complete MIME block header is obtained, then performing a pop operation, storing the element popped from the top of the stack in the path label path, and extracting the MIME header field information of the current block, namely executing step 1201; advancing the current pointer by 1; if the input is not exhausted, jumping to step 1104; otherwise, ending the MIME block separation state machine;
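The path labeling produced by the separation steps can be illustrated with a minimal Python sketch. It is not the claimed state machine: it substitutes a small recursive parser for the prestate/stack logic, assumes the BODYSTRUCTURE text contains only quoted strings, atoms and parentheses (no IMAP literals, no escaped quotes, no nested message/rfc822 bodies), and the function names `parse_bodystructure` and `label_parts` are hypothetical. Labels follow the convention described above: "0" for the top-level block, "1", "2", "2.1" and so on for nested blocks, and a single-block message is reported under both "0" and "1".

```python
def _parse_list(text, pos):
    """Parse one parenthesized list starting at text[pos]; return (items, next_pos)."""
    if pos >= len(text) or text[pos] != "(":
        raise ValueError("expected '('")
    pos += 1
    items = []
    while pos < len(text):
        ch = text[pos]
        if ch == ")":
            return items, pos + 1
        if ch in " \t":
            pos += 1                      # skip spaces and tabs (cf. step 1102)
        elif ch == "(":
            child, pos = _parse_list(text, pos)
            items.append(child)
        elif ch == '"':
            end = text.index('"', pos + 1)  # assumption: no escaped quotes
            items.append(text[pos + 1:end])
            pos = end + 1
        else:
            end = pos
            while end < len(text) and text[end] not in " \t()":
                end += 1
            items.append(text[pos:end])   # atom such as NIL or a number
            pos = end
    raise ValueError("unbalanced parentheses")


def parse_bodystructure(text):
    """Turn a BODYSTRUCTURE string into nested Python lists."""
    tree, _ = _parse_list(text.strip(" \t"), 0)
    return tree


def label_parts(node, path="0"):
    """Yield (path_label, block) pairs for every MIME block header."""
    yield path, node
    if node and isinstance(node[0], list):
        # MULTIPART block: the leading sub-lists are the child blocks.
        index = 0
        for child in node:
            if not isinstance(child, list):
                break
            index += 1
            child_path = str(index) if path == "0" else f"{path}.{index}"
            yield from label_parts(child, child_path)
    elif path == "0":
        # Single-block message: also addressable as part "1".
        yield "1", node


SAMPLE = ('(("TEXT" "PLAIN" ("CHARSET" "UTF-8") NIL NIL "7BIT" 12 1)'
          '(("TEXT" "HTML" ("CHARSET" "UTF-8") NIL NIL "QUOTED-PRINTABLE" 40 2)'
          '("IMAGE" "PNG" ("NAME" "a.png") NIL NIL "BASE64" 100)'
          ' "RELATED" ("BOUNDARY" "b2")) "MIXED" ("BOUNDARY" "b1"))')

for label, block in label_parts(parse_bodystructure(SAMPLE)):
    kind = "MULTIPART" if isinstance(block[0], list) else "/".join(block[:2])
    print(label, kind)   # labels produced: 0, 1, 2, 2.1, 2.2
```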
the steps of the MIME block header field extraction state machine comprise:
step 1201: determining whether the current MIME block is a MULTIPART block; if so, executing step 1203; otherwise, executing step 1202;
step 1202: extracting the fields one by one in the order of the MIME block header fields, determining from context whether each optional field is present and whether to move to the next field state; wherein, for the parameter list, the values of the attributes CHARSET and NAME are extracted; for the body encoding, the encoding type is extracted; for the body disposition, the value of FILENAME is extracted; after extraction, adding the MIME block header information to the hash table;
step 1203: extracting the fields one by one in the order of the MIME header fields, determining from context whether each optional field is present and whether to move to the next field state; wherein, for the parameter list, the value of the BOUNDARY attribute is extracted; after extraction, adding the MIME block header information to the hash table.
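Steps 1201 to 1203 can be sketched in Python as follows. The sketch assumes each MIME block header has already been split into a nested list (quoted strings and atoms as `str`, parenthesized groups as `list`), and it relies on the field order that RFC 3501 defines for BODYSTRUCTURE rather than on the claimed per-field states. The names `params_to_dict` and `extract_header` are hypothetical, and message/rfc822 blocks are not handled.

```python
def params_to_dict(plist):
    """Convert ["CHARSET", "UTF-8", "NAME", "a.txt"] into {"CHARSET": "UTF-8", ...}."""
    if not isinstance(plist, list):
        return {}                         # the field was NIL
    return {str(k).upper(): v for k, v in zip(plist[0::2], plist[1::2])}


def extract_header(block):
    """Return the header fields named in steps 1202 and 1203 for one MIME block."""
    if block and isinstance(block[0], list):
        # Step 1203: MULTIPART block. Child blocks come first, then the
        # subtype, then (optionally) the parameter list holding BOUNDARY.
        idx = next((i for i, x in enumerate(block) if not isinstance(x, list)), None)
        params = {}
        if idx is not None and len(block) > idx + 1:
            params = params_to_dict(block[idx + 1])
        return {"multipart": True, "boundary": params.get("BOUNDARY")}

    # Step 1202: basic fields are type, subtype, parameter list, id,
    # description, encoding, size; TEXT blocks add a line count.
    params = params_to_dict(block[2])
    ext = 8 if block[0].upper() == "TEXT" else 7   # index of the optional MD5 field
    disposition = block[ext + 1] if len(block) > ext + 1 else None
    filename = None
    if isinstance(disposition, list) and len(disposition) > 1:
        filename = params_to_dict(disposition[1]).get("FILENAME")
    return {
        "multipart": False,
        "charset": params.get("CHARSET"),
        "name": params.get("NAME"),
        "encoding": block[5].upper(),
        "filename": filename,
    }


attachment = ["IMAGE", "PNG", ["NAME", "a.png"], "NIL", "NIL", "BASE64", "100",
              "NIL", ["ATTACHMENT", ["FILENAME", "a.png"]]]
print(extract_header(attachment))
```

Optional fields that are absent simply come back as `None`, which mirrors the context-dependent skipping of optional fields described in the steps.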
8. The system of claim 7, wherein if the current MIME block is a MULTIPART block, the BOUNDARY attribute information is retrieved from the hash table for MIME decoding; if the current MIME block is not a MULTIPART block, entity decoding is performed according to the encoding type; and if the current MIME block is an attachment, the attachment name is associated with it.
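The decoding choices in claim 8 can be illustrated with a short Python sketch using the standard `base64` and `quopri` modules. The header dictionary layout (`multipart`, `boundary`, `encoding`, `filename`, `name`) and the function names are assumptions made for this illustration, and the boundary split is deliberately naive: it does not check that the delimiter starts a line.

```python
import base64
import quopri


def decode_entity(raw, encoding):
    """Decode a MIME entity body according to its transfer encoding."""
    enc = (encoding or "7BIT").upper()
    if enc == "BASE64":
        return base64.b64decode(raw)
    if enc == "QUOTED-PRINTABLE":
        return quopri.decodestring(raw)
    return raw                            # 7BIT, 8BIT and BINARY need no decoding


def split_multipart(raw, boundary):
    """Split a MULTIPART body on its BOUNDARY (naive sketch)."""
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for chunk in raw.split(delimiter)[1:]:    # [1:] drops the preamble
        if chunk.startswith(b"--"):           # closing delimiter reached
            break
        parts.append(chunk.strip(b"\r\n"))
    return parts


def handle_block(header, raw):
    """Apply the three cases of claim 8 to one captured MIME block entity."""
    if header.get("multipart"):
        return {"parts": split_multipart(raw, header["boundary"])}
    result = {"data": decode_entity(raw, header.get("encoding"))}
    attachment_name = header.get("filename") or header.get("name")
    if attachment_name:
        result["attachment"] = attachment_name   # associate the attachment name
    return result


print(handle_block({"multipart": False, "encoding": "BASE64", "filename": "a.txt"},
                   b"aGVsbG8="))
```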
9. The system of claim 7, wherein the IMAP email username, the mail unique identifier UID, and the MIME block path label path corresponding to the MIME block header are combined in order and in a set format, the result of the combination being the hash table key KEYx of the MIME block header.
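A minimal Python sketch of the key construction and lookup follows. The claim does not specify the set format, so the "|" separator, the in-memory `dict` used as the hash table, and the function names are assumptions for illustration only. The second half shows how the same key might be rebuilt on the data link from an IMAP FETCH response line, where the section number inside `BODY[...]` plays the role of the path label.

```python
import re

header_table = {}                         # hash table: KEYx -> MIME block header


def make_key(username, uid, path, sep="|"):
    """Combine username, mail UID and path label in a fixed order (assumed format)."""
    return sep.join((username, str(uid), path))


def register_header(username, uid, path, header):
    """Store a parsed MIME block header under its key."""
    header_table[make_key(username, uid, path)] = header


_UID = re.compile(rb"\bUID (\d+)")
_SECTION = re.compile(rb"\bBODY\[(\d+(?:\.\d+)*)\]")


def key_from_fetch(username, line):
    """Rebuild the key from a FETCH response line; return None if fields are missing."""
    uid = _UID.search(line)
    section = _SECTION.search(line)
    if uid is None or section is None:
        return None
    return make_key(username, uid.group(1).decode(), section.group(1).decode())


# Header registered when the BODYSTRUCTURE reply is parsed ...
register_header("alice@example.com", 345, "2.1", {"encoding": "BASE64"})
# ... and retrieved when the entity arrives, possibly on another TCP link.
key = key_from_fetch("alice@example.com", b"* 12 FETCH (UID 345 BODY[2.1] {1024}")
print(key, header_table.get(key))
```

Because the key depends only on the username, the UID and the path label, the lookup works regardless of which TCP link carried the BODYSTRUCTURE reply and which carried the entity.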
10. The system according to any of claims 6 to 9, wherein said BODYSTRUCTURE parsing state machine generates a path label for each MIME block header in said BODYSTRUCTURE data structure, said path label corresponding to the label of the corresponding MIME block entity in the IMAP4 protocol reply.
CN201710975868.7A 2017-10-19 2017-10-19 IMAP protocol multi-link association analysis method and system Active CN109688043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710975868.7A CN109688043B (en) 2017-10-19 2017-10-19 IMAP protocol multi-link association analysis method and system

Publications (2)

Publication Number Publication Date
CN109688043A CN109688043A (en) 2019-04-26
CN109688043B CN109688043B (en) 2020-05-22

Family

ID=66183490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710975868.7A Active CN109688043B (en) 2017-10-19 2017-10-19 IMAP protocol multi-link association analysis method and system

Country Status (1)

Country Link
CN (1) CN109688043B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112256801B (en) * 2020-10-10 2024-04-09 深圳力维智联技术有限公司 Method, system and storage medium for extracting key entity in entity relation diagram

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105407094A (en) * 2015-11-23 2016-03-16 广东数字证书认证中心有限公司 Method and device for improving safety of e-mail, safe e-mail agent system
CN106385358A (en) * 2016-09-06 2017-02-08 四川秘无痕信息安全技术有限责任公司 Method for checking and evidence collection aiming at E-mail data packet
CN107171950A (en) * 2017-07-20 2017-09-15 国网上海市电力公司 A kind of Email Body threatens the recognition methods of behavior

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8667074B1 (en) * 2012-09-11 2014-03-04 Bradford L. Farkas Systems and methods for email tracking and email spam reduction using dynamic email addressing schemes

Also Published As

Publication number Publication date
CN109688043A (en) 2019-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant