CN101795230B - Network flow recovery method - Google Patents

Publication number
CN101795230B
Authority
CN
China
Prior art keywords
session
flow
packet
tcp
tcp session
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201010112581XA
Other languages
Chinese (zh)
Other versions
CN101795230A (en
Inventor
郑庆华
倪华
陶敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN201010112581XA
Publication of CN101795230A
Application granted
Publication of CN101795230B
Legal status: Expired - Fee Related

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a network flow recovery method, belongs to the technical field of the Internet, and discloses a method for recovering network traffic into files. The method adopts a two-level parallel strategy to make full use of the processing capacity of a multi-core computer. First, high-speed traffic is captured and split twice, once by XOR of the MAC addresses and once by XOR of the IP addresses, so that the captured raw traffic is divided into a plurality of fine network flows; this realizes the first level of parallelism, at the flow level. Then, within the recovery process of each fine network flow, working modules that pass data to one another are connected by a loosely coupled multithreaded architecture based on the producer-consumer model, which realizes the second level of parallelism, at the thread level. In addition, the method performs load balancing on each fine network flow and supports both IPv4 and IPv6. The invention aims to solve the problem of converting "invisible" network traffic into information that a computer can process directly under high-speed network bandwidth, and provides technical support for identifying and blocking the transmission of illegal network information.

Description

A network flow recovery method
Technical field
The present invention relates to the field of Internet technology, and in particular to a method that recovers network traffic into application-layer files in parallel, using traffic splitting to make full use of the computing capacity of existing multi-core computers.
Background art
Since the reform and opening-up, the sustained and rapid growth of China's economy and the development of Internet technology have together driven the growth of the Internet sector. A report issued on July 16, 2009 by the China Internet Network Information Center (CNNIC) showed that the number of Internet users in China exceeded 300 million in the first half of that year. The number of Internet users, the number of broadband users, and the number of registrations under the national top-level domain (12.96 million) all ranked first in the world, and Internet penetration rose steadily.
At the same time, network information security has become a matter of great concern in all sectors. The CNNIC report shows that Chinese Internet users currently make heavy use of entertainment, information, and social applications; apart from forums/BBS, the penetration of these three categories of network application among Internet users exceeds 50% in every case. With hundreds of millions of users, the volume of information spread over the Internet grows explosively, so supervising the information transmitted over the network and preventing the rapid spread of harmful information is a serious problem. It faces two main difficulties. First, network traffic (packets) in transit cannot be understood directly by programs; that is, application-layer files transmitted over the network are "invisible" to the computer and must be processed into a format the computer can handle. Second, the development of Internet technology and the demand for bandwidth push network equipment toward ever higher speeds, so overcoming the bottleneck of real-time traffic capture and recovery on a computer in a high-bandwidth network environment is an equally pressing problem.
The present invention aims to solve the problem of converting "invisible" network traffic into information that a computer can process directly under high-speed network bandwidth, and provides technical support for identifying and blocking the spread of harmful and illegal network information.
Summary of the invention
The object of the present invention is to solve two problems in existing network traffic processing: the data in transit is "invisible", and real-time traffic capture and recovery is a bottleneck in a high-bandwidth network environment. To make full use of the parallel computing capability of multi-core processors and raise the processing capacity of the equipment, the invention adopts a two-level parallel strategy. First, the raw traffic is captured and split twice into a plurality of fine network flows, which realizes the first level of parallelism, at the flow level. Then, within the recovery process of each fine network flow, working modules that pass data to one another are connected by a loosely coupled multithreaded architecture based on the producer-consumer model, which realizes the second level of parallelism, at the thread level. Through TCP session reassembly and application-layer protocol parsing, the invention recovers "invisible" network traffic into "visible" application-layer files, and thereby provides technical support for identifying and blocking the spread of harmful information on the network. In addition, the method performs intelligent load balancing on the recovery process of each fine network flow and supports both IPv4 and IPv6.
The technical solution adopted by the present invention to solve this technical problem is a network flow recovery method, characterized as follows:
The method first captures the raw network traffic with a traffic capture module and splits it into a plurality of coarse network flows by hashing the MAC addresses; each coarse network flow is then split further, by a flow splitting module, into a plurality of fine network flows. Next, a TCP session reassembly module performs TCP session reassembly on the IP packets (IPv4 and IPv6) of each fine network flow to obtain TCP sessions and updates the state of each session as it goes; when a session's state becomes finished or timed out, the session is transferred to the pending TCP session queue. Finally, an intelligent parsing and recovery module takes sessions from the pending TCP session queue in order, recovers each session into application-layer files through protocol parsing, decoding, and decompression, and submits the file information and the corresponding session information to a database; the same module also balances the load of the fine network flow automatically.
The traffic capture module sets the number of coarse network flows, m, according to the number of computing cores of the server, and numbers the coarse network flows 0, 1, ..., m-1. The module performs the following steps:
(1-1) The module captures raw packets from the high-speed network;
(1-2) The captured packets are filtered by protocol, and only IPv4 and IPv6 packets are kept;
(1-3) The SMAC (source MAC address) and DMAC (destination MAC address) field values are extracted from each IP packet;
(1-4) SMAC and DMAC are XORed and the result is taken modulo m (the number of coarse network flows); the value obtained is the number of the coarse network flow to which the IP packet is assigned, and the IP packet is then sent to the coarse network flow with that number.
The flow splitting module splits every coarse network flow. According to the number of coarse network flows m, it sets the number of fine network flows into which each coarse network flow is divided to n, and numbers the fine network flows 0, ..., n-1. Splitting one coarse network flow comprises the following steps:
(2-1) Packets are extracted from the coarse network flow;
(2-2) The SIP (source IP address) and DIP (destination IP address) field values are extracted from each IP packet;
(2-3) SIP and DIP are XORed and the result is taken modulo n; the value obtained is the number of the fine network flow to which the IP packet is assigned, and the IP packet is then sent to the fine network flow with that number.
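Steps (2-1) to (2-3) can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names `fine_flow_index` and `split_coarse_flow` and the `(src_ip, dst_ip, payload)` packet tuple are hypothetical, and the whole address is XORed at once instead of in the 2-byte units that the description specifies elsewhere.

```python
import ipaddress


def fine_flow_index(src_ip: str, dst_ip: str, n: int) -> int:
    """Return the fine-flow number (0..n-1) for one IPv4 or IPv6 packet."""
    sip = int(ipaddress.ip_address(src_ip))
    dip = int(ipaddress.ip_address(dst_ip))
    # XOR is commutative, so both directions of a session get the same number.
    return (sip ^ dip) % n


def split_coarse_flow(packets, n):
    """Distribute (src_ip, dst_ip, payload) tuples over n fine flows."""
    fine_flows = [[] for _ in range(n)]
    for src_ip, dst_ip, payload in packets:
        index = fine_flow_index(src_ip, dst_ip, n)
        fine_flows[index].append((src_ip, dst_ip, payload))
    return fine_flows
```

Because the index depends only on the unordered pair of addresses, the request and response packets of one connection always land in the same fine flow, which is what the later TCP session reassembly relies on.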
The TCP session reassembly module uses as its key the four-tuple formed by the source IP address SIP (SIPv4 or SIPv6), the destination IP address DIP (DIPv4 or DIPv6), the source port number (SP), and the destination port number (DP) of a TCP session. It computes a hash value by XORing these four parameters and creates two TCP session hash tables, one for IPv4 and one for IPv6. Each node of a hash table is a TCP session. A TCP session contains two linked lists, which hold the packets sent by the source end and the packets sent by the destination end respectively; the packets in each list are reordered by TCP sequence number and acknowledgment number. The module performs the following steps:
(3-1) The TCP session reassembly module obtains an IP packet that has been processed by the IP fragment reassembly module, extracts its IP protocol version (IPv4 or IPv6), and determines whether it is a TCP packet; if so, go to step (3-2); otherwise, go to step (3-1);
(3-2) The four-tuple (key) is extracted from the packet, and the hash value is computed to look up whether a TCP session with the same four-tuple (key) exists in the corresponding session hash table (IPv4 or IPv6); if so, that TCP session is obtained and processing goes to step (3-3); otherwise, go to step (3-4);
(3-3) The FIN, RST, ACK, sequence number, acknowledgment number, window size, and other field values are read from the TCP header of the packet; using these values, the packet is reassembled and sorted into the TCP session found in step (3-2), and the state of that TCP session is updated; if the state becomes finished, reset, or timed out, go to step (3-5); otherwise, go to step (3-1);
(3-4) Because no TCP session with the same key exists in the corresponding session hash table (IPv4 or IPv6), a hash table node (that is, a TCP session) with this key is created in that table; go to step (3-1);
(3-5) The TCP session is removed from its session hash table and transferred to the pending TCP session queue;
(3-6) A timer is started during TCP reassembly. Every 5 seconds, or at a user-defined interval, it scans the two TCP session hash tables for IPv4 and IPv6, removes every TCP session that has not been updated (that is, has received no new packet) for more than 10 seconds or a user-defined period, and transfers the removed sessions to the pending TCP session queue.
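The session bookkeeping in steps (3-2), (3-4), (3-5), and (3-6) might look roughly like the sketch below. It rests on several assumptions that are not in the patent text: a Python `dict` stands in for the XOR-hashed table, the key is normalized so that both directions of a connection map to one session, and the names `session_key`, `TcpSession`, and `SessionTable` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


def session_key(sip, sp, dip, dp):
    """Direction-independent key (an assumption made for this sketch)."""
    return tuple(sorted([(sip, sp), (dip, dp)]))


@dataclass
class TcpSession:
    key: tuple
    last_update: float
    src_packets: list = field(default_factory=list)  # sent by the source end
    dst_packets: list = field(default_factory=list)  # sent by the destination end


class SessionTable:
    def __init__(self, idle_timeout=10.0):
        self.tables = {4: {}, 6: {}}   # one table per IP version
        self.pending = deque()         # pending TCP session queue (FIFO)
        self.idle_timeout = idle_timeout

    def lookup_or_create(self, version, key, now):
        """Steps (3-2)/(3-4): find the session or create a new node."""
        table = self.tables[version]
        session = table.get(key)
        if session is None:
            session = table[key] = TcpSession(key, now)
        session.last_update = now
        return session

    def finish(self, version, key):
        """Step (3-5): move a finished or reset session to the queue."""
        session = self.tables[version].pop(key, None)
        if session is not None:
            self.pending.append(session)

    def sweep(self, now):
        """Step (3-6): called by a periodic timer to expire idle sessions."""
        for table in self.tables.values():
            expired = [k for k, s in table.items()
                       if now - s.last_update > self.idle_timeout]
            for k in expired:
                self.pending.append(table.pop(k))
```

In a running system `sweep` would be driven by a timer firing every 5 seconds (the default interval stated above); passing `now` explicitly keeps the sketch easy to test.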
The intelligent parsing and recovery module performs the following steps:
(4-1) The load factor L of the fine network flow is calculated and compared with a given load threshold (0.95 by default, or user-defined): if L is greater than the load threshold, the fine network flow is heavily loaded, go to (4-2); if L is less than or equal to the load threshold, the fine network flow is lightly loaded, go to (4-3);
(4-2) A TCP session is taken in order from the pending TCP session queue produced by the TCP session reassembly module and stored in its entirety in the session cache file, and the cached session count is incremented by one; go to (4-1);
(4-3) Determine whether the cached session count is zero; if so, go to (4-4); otherwise, go to (4-5);
(4-4) A TCP session is taken in order from the pending TCP session queue produced by the TCP session reassembly module, and its session information, such as time, application-layer protocol, and session data size, is analyzed; the session is then submitted to the corresponding application-layer protocol processing module for parsing, decoding, and decompression to complete the recovery; go to (4-6);
(4-5) A session that was cached while the system was heavily loaded is taken in order from the session cache file, and the cached session count is decremented by one; its session information, such as application-layer protocol and session data size, is then analyzed, and the session is submitted according to that information to the corresponding application-layer protocol parsing and recovery module for parsing, decoding, and decompression to complete the recovery; go to (4-6);
(4-6) The suffix of the recovered file is matched against a library of regular file suffixes; if the suffix is not in the library, the file format is determined by analyzing the corresponding session information, the file header, and the file data, and the correct suffix is added to the file; finally the recovered file is stored, and the file information and the corresponding session information are submitted to the database; go to (4-1).
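The branching in steps (4-1) to (4-5) can be condensed into one scheduling step, sketched below. The function name `schedule_step` and the `recover` callback are hypothetical, and an in-memory `deque` stands in for the session cache file, which the patent stores on disk.

```python
from collections import deque


def schedule_step(load_factor, pending, cache, recover, threshold=0.95):
    """Run one pass of steps (4-1) to (4-5) for a fine network flow.

    pending: FIFO queue of reassembled TCP sessions.
    cache:   FIFO stand-in for the session cache file.
    recover: callback that parses, decodes, and decompresses one session.
    """
    if load_factor > threshold:
        # (4-2) heavy load: park the next session instead of recovering it.
        if pending:
            cache.append(pending.popleft())
    elif cache:
        # (4-3) -> (4-5) light load with a backlog: drain the cache first.
        recover(cache.popleft())
    elif pending:
        # (4-3) -> (4-4) light load, no backlog: recover straight from the queue.
        recover(pending.popleft())
```

Draining the cache before the live queue preserves the first-in, first-out order of sessions across a heavy-load episode.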
Here, setting the number of coarse network flows m according to the number of computing cores of the server means that m equals the number of computing cores of the server. Setting, according to m, the number n of fine network flows into which each coarse network flow is divided means that n is 2 or 4. The file suffix library in step (4-6) contains the suffixes of regular file types such as web page files, audio/video files, document files, and binary files.
Because the recovery process involves frequent dynamic memory operations, all such operations throughout the recovery process are implemented with a memory pool in order to improve the efficiency and stability of the system.
The session cache file in the intelligent parsing and recovery module is accessed on a first-in, first-out (FIFO) basis, and sessions in the pending TCP session queue are accessed on a FIFO basis as well.
The whole recovery method adopts a two-level parallel strategy: first, traffic splitting realizes the first level of parallelism, at the flow level; then, within the recovery process of each fine network flow, working modules that pass data to one another are connected by a loosely coupled multithreaded architecture based on the producer-consumer model, which realizes the second level of parallelism, at the thread level.
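The thread-level half of this strategy can be illustrated with a bounded queue between two threads. This is only a sketch of the general producer-consumer pattern: `run_pipeline`, `reassemble`, and `recover` are hypothetical names, and the patent does not state which threading primitives its implementation uses.

```python
import queue
import threading


def run_pipeline(packets, reassemble, recover, capacity=1024):
    """Connect a reassembly stage and a recovery stage through a FIFO queue."""
    sessions = queue.Queue(maxsize=capacity)  # bounded: a slow consumer throttles the producer
    results = []
    done = object()                           # sentinel marking the end of input

    def producer():
        for packet in packets:
            session = reassemble(packet)      # returns None until a session completes
            if session is not None:
                sessions.put(session)
        sessions.put(done)

    def consumer():
        while True:
            session = sessions.get()
            if session is done:
                break
            results.append(recover(session))

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The two stages share nothing but the queue, which is what makes the coupling loose: either stage can be replaced or replicated without changing the other.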
The load factor L of a fine network flow in step (4-1) depends on the following parameters: the utilization P of the corresponding computing core, and the load factors α_V4 and α_V6 of the session hash tables for IPv4 and IPv6 in the reassembly module of this fine network flow; that is, L = P(α_V4 + α_V6). The load factor of a hash table is defined as: load factor = number of elements (nodes) inserted in the table / length of the hash table.
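As a worked illustration, the sketch below evaluates the load factor under the assumption that the formula is a product of the core utilization and the sum of the two hash-table load factors. The function names and the sample numbers are hypothetical.

```python
def hash_table_load(nodes: int, table_length: int) -> float:
    """Load factor of one hash table: inserted nodes / table length."""
    return nodes / table_length


def fine_flow_load(core_utilization, v4_nodes, v4_length, v6_nodes, v6_length):
    """L = P * (alpha_v4 + alpha_v6), reading the formula as a product (assumption)."""
    alpha_v4 = hash_table_load(v4_nodes, v4_length)
    alpha_v6 = hash_table_load(v6_nodes, v6_length)
    return core_utilization * (alpha_v4 + alpha_v6)


def is_heavily_loaded(load, threshold=0.95):
    """Heavy load means the load factor strictly exceeds the threshold."""
    return load > threshold
```

For example, a core at 50% utilization with an IPv4 table holding 512 sessions in 1024 slots and an IPv6 table holding 256 sessions in 1024 slots gives L = 0.5 × (0.5 + 0.25) = 0.375, which is below the default threshold of 0.95.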
The XOR operations in steps (1-4) and (2-3) are computed in units of 2 bytes, and so is the hash calculation in step (3-2): every parameter taking part in the XOR is split into equal units of 2 bytes (16 bits), a parameter that is exactly 16 bits long needs no splitting, and all the resulting units are then XORed together.
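The 2-byte folding can be sketched as follows, here applied to the four-tuple hash of step (3-2). The names `xor_fold16` and `four_tuple_hash` are hypothetical, and big-endian unit order is an assumption; the patent does not specify byte order.

```python
import ipaddress


def xor_fold16(value: bytes) -> int:
    """XOR together the 2-byte (16-bit) units of a byte string."""
    if len(value) % 2:
        value = b"\x00" + value  # pad odd-length input (assumption)
    folded = 0
    for i in range(0, len(value), 2):
        folded ^= int.from_bytes(value[i:i + 2], "big")
    return folded


def four_tuple_hash(sip: str, dip: str, sp: int, dp: int) -> int:
    """16-bit hash of (SIP, DIP, SP, DP); ports are already 16 bits wide."""
    sip_bytes = ipaddress.ip_address(sip).packed  # 4 bytes for IPv4, 16 for IPv6
    dip_bytes = ipaddress.ip_address(dip).packed
    return xor_fold16(sip_bytes) ^ xor_fold16(dip_bytes) ^ sp ^ dp
```

A table index would then be obtained by reducing this 16-bit value modulo the table length. Since every operation is an XOR, swapping source and destination leaves the hash unchanged.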
Description of drawings
Fig. 1 is a schematic diagram of the overall framework of the present invention;
Fig. 2 is a flow chart of the TCP session reassembly module in Fig. 1;
Fig. 3 is a flow chart of the intelligent parsing and recovery module in Fig. 1.
Embodiment
Fig. 1 shows the overall flow of the present invention.
First, the traffic capture module intercepts raw packets from the backbone network and performs protocol filtering and splitting. Because the CPU of the server used in the experiment is a quad-core processor, the number of coarse network flows is set to 4, so the value of m in Fig. 1 is 4. The raw network traffic is filtered to extract the IP packets (both IPv4 and IPv6), and the extracted IP packets are then split with a hash algorithm. The algorithm XORs the source MAC address (SMAC) and the destination MAC address (DMAC) of the IP packet and takes the result modulo m as the hash value. Concretely, the source MAC address and the destination MAC address are each split into three 16-bit (2-byte) values, and these 16-bit units are XORed together in turn. The hash value lies in the set {0, 1, 2, 3}, whose members correspond to coarse network flow 0, coarse network flow 1, coarse network flow 2, and coarse network flow 3 respectively. For example, if the hash value computed from the source MAC address and destination MAC address of an IP packet is 1, the packet is assigned to coarse network flow 1, and likewise for the other hash values.
Because the hash function used for this split takes the source and destination MAC field values as its parameters, all packets of a TCP session in the raw network traffic are assigned to the same coarse network flow: XOR is commutative, so the two directions of a session, whose source and destination addresses are swapped, yield the same hash value. This preserves the integrity of TCP sessions in the coarse network flows after splitting and makes the subsequent flow splitting module and TCP session reassembly module feasible. Each of the 4 coarse network flows then undergoes the second split in the flow splitting module, which divides the coarse network flow further into fine network flows. Finally, each fine network flow undergoes IP fragment reassembly and TCP session reassembly, and the reassembled TCP sessions are parsed and recovered intelligently.
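The first-level split for the quad-core example (m = 4) can be sketched like this. The helper names `mac_units` and `coarse_flow_index`, the colon-separated MAC string format, and the sample addresses are assumptions made for illustration.

```python
def mac_units(mac: str):
    """Split a MAC address such as '00:11:22:33:44:55' into three 16-bit values."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, 6, 2)]


def coarse_flow_index(smac: str, dmac: str, m: int = 4) -> int:
    """XOR the six 16-bit units of SMAC and DMAC, then reduce modulo m."""
    folded = 0
    for unit in mac_units(smac) + mac_units(dmac):
        folded ^= unit
    return folded % m
```

With SMAC `00:11:22:33:44:55` and DMAC `66:77:88:99:aa:bb` the six units fold to 0x2222, so the packet goes to coarse network flow 2; the reply packet, with the two addresses swapped, folds to the same value and lands in the same flow.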
In the specific implementation, because the recovery process involves a large number of memory operations and existing dynamic memory management has shortcomings, memory management throughout the reassembly and recovery process is implemented with a memory pool. This not only improves the efficiency of traffic recovery but also reduces memory fragmentation and improves the stability of the program.
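The idea of a memory pool is to allocate fixed-size buffers once and reuse them, rather than allocating and freeing on every packet. The sketch below shows the pattern only; the class name `BufferPool` is hypothetical, and a native implementation would manage raw memory blocks rather than Python `bytearray` objects.

```python
class BufferPool:
    """Fixed-size buffer pool: preallocate once, then reuse."""

    def __init__(self, count: int, size: int):
        self._size = size
        self._free = [bytearray(size) for _ in range(count)]

    def acquire(self) -> bytearray:
        # Reuse a free buffer when one exists; otherwise grow the pool.
        return self._free.pop() if self._free else bytearray(self._size)

    def release(self, buffer: bytearray) -> None:
        # Return the buffer for reuse instead of discarding it.
        self._free.append(buffer)
```

Because every buffer has the same size, released blocks can always satisfy later requests, which is how a pool avoids the fragmentation mentioned above.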
Fig. 2 shows the flow of the TCP session reassembly module in the network flow recovery method of Fig. 1.
The TCP session reassembly module uses as its key the four-tuple formed by the source IP (IPv4 or IPv6), destination IP (IPv4 or IPv6), source port number, and destination port number of a TCP session, computes a hash value from these four parameters, and creates separate TCP session hash tables for IPv4 and IPv6. Each node of a session hash table is a TCP session. A TCP session has two linked lists, which hold the packets sent by the source end and the packets sent by the destination end respectively; packets belonging to the session are reordered as they are inserted into the lists.
When the TCP session reassembly module is initialized, it starts a timer. At a fixed interval (5 seconds by default) the timer scans the two TCP session hash tables (for IPv4 and IPv6), removes the TCP sessions that have received no new packet within a preset period (10 seconds by default), and forwards the timed-out TCP sessions in order to the pending TCP session queue.
The specific steps of the TCP session reassembly module are as follows:
Step 1: The module takes, in order, an IP packet that has been processed by the IP fragment reassembly module, extracts its IP protocol version (IPv4 or IPv6), and determines whether it is a TCP packet; if so, go to step 2; otherwise, discard the packet and go to step 1.
Step 2: Extract the four-tuple (key) formed by the source IP (v4/v6), destination IP (v4/v6), source port number, and destination port number of the TCP packet, compute the hash value, and look up whether a TCP session with the same key exists in the corresponding TCP session hash table; if so, obtain that TCP session and go to step 3; otherwise, go to step 4.
Step 3: Read the sequence number, acknowledgment number, window size, FIN, RST, ACK, and other field values from the TCP header of the packet, and use them to reassemble and sort the packet within the TCP data stream found in step 2: if the packet comes from the source end, it is sorted into the source-end packet list of the TCP session; otherwise it is sorted into the destination-end packet list. While sorting, update the state of the TCP session; if the state becomes finished, reset, or timed out, go to step 5; otherwise, go to step 1.
Step 4: Because no TCP session with this key exists in the corresponding TCP session hash table (IPv4 or IPv6), determine from the SYN and other field values in the TCP packet whether it belongs to a newly arriving TCP session. If so, create a TCP session with this key in the corresponding TCP session hash table; if not, discard the packet. Go to step 1.
Step 5: Remove the TCP session from its TCP session hash table and transfer it to the pending TCP session queue. At this point every TCP session in the pending TCP session queue has been reassembled, and concatenating in order the data that follows the TCP header of each packet yields the application-layer data actually transmitted in the session.
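The sorting in step 3 and the concatenation in step 5 can be sketched for one direction of a session as follows. The class name `DirectionBuffer` is hypothetical, a sorted Python list replaces the linked list, and the sketch ignores sequence-number wraparound and assumes that each stored sequence number refers to the first payload byte of its segment.

```python
import bisect


class DirectionBuffer:
    """Segments sent by one end of a TCP session, kept in sequence order."""

    def __init__(self):
        self._segments = []  # sorted list of (sequence number, payload)

    def insert(self, seq: int, payload: bytes) -> None:
        """Insert one segment at its sorted position (step 3)."""
        bisect.insort(self._segments, (seq, payload))

    def stream(self) -> bytes:
        """Concatenate payloads in order (step 5), skipping retransmitted data."""
        data = bytearray()
        expected = None
        for seq, payload in self._segments:
            if expected is None:
                expected = seq
            if seq + len(payload) <= expected:
                continue                      # segment already fully covered
            if seq > expected:
                break                         # gap: later data cannot be used yet
            data += payload[expected - seq:]  # drop any overlapping prefix
            expected = seq + len(payload)
        return bytes(data)
```

A session would hold two such buffers, one per direction; the application-layer protocol parser then works on the byte strings returned by `stream`.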
Fig. 3 is the flow chart of the intelligent parsing and recovery module in the network flow recovery method of Fig. 1.
The intelligent parsing and recovery module is first initialized, which includes setting the load threshold of the fine network flow. The module then performs the following steps:
Step 1: Calculate the load factor L of the fine network flow and compare it with the given load threshold: if L is greater than the load threshold, the fine network flow is heavily loaded, go to step 2; if L is less than or equal to the load threshold, the fine network flow is lightly loaded, go to step 3;
Step 2: Take a TCP session in order from the pending TCP session queue produced by the TCP session reassembly module, store the session in its entirety in the session cache file, increment the cached session count by one, and go to step 1. Caching the sessions that await recovery while the system is heavily loaded evens out the load of the fine network flow and thereby realizes its load balancing.
Step 3: Determine whether the cached session count is zero; if so, go to step 4; otherwise, go to step 5;
Step 4: Take a TCP session in order from the pending TCP session queue produced by the TCP session reassembly module and analyze its session information, such as time, application-layer protocol, and session data size; then submit the session to the corresponding application-layer protocol processing module for protocol parsing, decoding, and decompression to complete the recovery; go to step 6;
Step 5: Take, in order, a session that was cached in the session cache file while the system was heavily loaded, and decrement the cached session count by one; then analyze its session information, such as application-layer protocol and session data size, and submit the session according to that information to the corresponding application-layer protocol parsing and recovery module for protocol parsing, decoding, and decompression to complete the recovery; go to step 6;
Step 6: Match the suffix of the recovered file against the library of regular file suffixes; if the suffix is not in the library, determine the file format by analyzing the corresponding session information, the file header, and the file data, and add the correct suffix to the file; finally store the recovered file, submit the file information and the corresponding session information to the database, and go to step 1.
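The file-header part of step 6 can be approximated by matching well-known leading bytes ("magic numbers"). The sketch below is an illustration only: the suffix set, the signature table, and the function name `fix_suffix` are hypothetical, and the patent additionally uses session information and file data, which are not modelled here.

```python
import os

# Hypothetical excerpt of a regular-suffix library.
KNOWN_SUFFIXES = {".html", ".pdf", ".png", ".jpg", ".gif", ".zip", ".mp3"}

# Widely documented file signatures (leading bytes) and the suffix they imply.
SIGNATURES = [
    (b"%PDF-", ".pdf"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"PK\x03\x04", ".zip"),
]


def fix_suffix(name: str, data: bytes) -> str:
    """Return the file name, with a suffix appended if the header reveals the type."""
    _, suffix = os.path.splitext(name)
    if suffix.lower() in KNOWN_SUFFIXES:
        return name                 # suffix already in the library
    for magic, detected in SIGNATURES:
        if data.startswith(magic):
            return name + detected  # add the suffix implied by the file header
    return name                     # format unknown: leave the name unchanged
```

For instance, a recovered file named `download` whose content begins with `%PDF-` would be stored as `download.pdf`.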
The above is a further detailed description of the present invention in connection with specific preferred embodiments, and the embodiments of the invention are not to be regarded as limited to it. A person of ordinary skill in the art to which the invention belongs may make simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the scope of patent protection determined by the submitted claims.

Claims (3)

1. A network flow recovery method, characterized by comprising the following steps:
(1) The raw network traffic is captured by a traffic capture module and split into a plurality of coarse network flows by hashing the MAC addresses; each coarse network flow is then split further, by a flow splitting module, into a plurality of fine network flows;
Said step (1) is:
The traffic capture module sets the number of coarse network flows, m, according to the number of computing cores of the server, and numbers the coarse network flows 0, 1, ..., m-1; the module performs the following steps:
(1-1) The module captures raw packets from the high-speed network;
(1-2) The captured packets are filtered by protocol, and only IPv4 and IPv6 packets are kept;
(1-3) The source MAC address SMAC and destination MAC address DMAC field values are extracted from each IP packet;
(1-4) SMAC and DMAC are XORed and the result is taken modulo m; the value obtained is the number of the coarse network flow to which the IP packet is assigned, and the IP packet is then sent to the coarse network flow with that number;
The flow splitting module splits every coarse network flow; according to the number of coarse network flows m, it sets the number of fine network flows into which each coarse network flow is divided to n, and numbers the fine network flows 0, ..., n-1; splitting one coarse network flow comprises the following steps:
(2-1) Packets are extracted from the coarse network flow;
(2-2) The source IP address SIP and destination IP address DIP field values are extracted from each IP packet;
(2-3) The source IP address SIP and the destination IP address DIP are XORed and the result is taken modulo n; the value obtained is the number of the fine network flow to which the IP packet is assigned, and the IP packet is then sent to the fine network flow with that number;
(2) A TCP session reassembly module performs TCP session reassembly on the IP packets of each fine network flow to obtain TCP sessions and updates the state of each session as it goes; when a session's state becomes finished or timed out, the session is transferred to the pending TCP session queue;
Said step (2) is:
The TCP session reassembly module uses as its key the four-tuple formed by the source IP address SIP, destination IP address DIP, source port number, and destination port number of a TCP session; it computes a hash value by XORing these four parameters and creates two TCP session hash tables, one for IPv4 and one for IPv6; each node of a hash table is a TCP session, and a TCP session contains two linked lists, which hold the packets sent by the source end and the packets sent by the destination end respectively; the packets in each list are reordered by TCP sequence number and acknowledgment number; the module performs the following steps:
(3-1) The TCP session reassembly module obtains an IP packet that has been processed by the IP fragment reassembly module, extracts its IP protocol version, and determines whether it is a TCP packet; if so, go to step (3-2); otherwise, go to step (3-1);
(3-2) The four-tuple is extracted from the packet, and the hash value is computed to look up whether a TCP session with the same four-tuple exists in the corresponding session hash table; if so, the TCP session with that key is obtained and processing goes to step (3-3); otherwise, go to step (3-4);
(3-3) The FIN, RST, ACK, sequence number, acknowledgment number, and window size field values are read from the TCP header of the packet; using these values, the packet is reassembled and sorted into the TCP session found in step (3-2), and the state of that TCP session is updated; if the state becomes finished, reset, or timed out, go to step (3-5); otherwise, go to step (3-1);
(3-4) Because no TCP session with the same key exists in the corresponding session hash table, a hash table node with this key is created in that table; go to step (3-1);
(3-5) The TCP session is removed from its session hash table and transferred to the pending TCP session queue;
(3-6) A timer is started during TCP reassembly; every 5 seconds, or at a user-defined interval, it scans the two TCP session hash tables for IPv4 and IPv6, removes every TCP session that has not been updated for more than 10 seconds or a user-defined period, and transfers the removed sessions to the pending TCP session queue;
(3) the intelligent parsing and recovery module extracts sessions in order from the pending TCP session queue, restores each session to an application-layer file through protocol parsing, decoding and decompression, and submits the application-layer file information and the corresponding session information to a database; at the same time, the intelligent parsing and recovery module automatically performs load balancing for this fine-grained network flow;
Said step (3) is:
(4-1) calculate the load factor L of this fine-grained network flow and compare it with a given load threshold: if L is greater than the load threshold, the load state of this flow is heavy, go to step (4-2); if L is less than or equal to the load threshold, the load state of this flow is light, go to step (4-3);
(4-2) extract the next TCP session in order from the pending TCP session queue generated by the TCP session reassembly module, store the complete session in a session cache file, increment the cached session count by one, and go to step (4-1);
(4-3) judge whether the cached session count is zero; if so, go to step (4-4); otherwise, go to step (4-5);
(4-4) extract the next TCP session in order from the pending TCP session queue generated by the TCP session reassembly module; intelligently analyze the time, application-layer protocol and data size of the session; then submit the session to the corresponding application-layer protocol processing module for parsing, decoding and decompression to complete the restoration; go to step (4-6);
(4-5) extract in order from the session cache file a session that was cached while the system was under heavy load, and decrement the cached session count by one; intelligently analyze the application-layer protocol and data size of the session; according to this session information, submit the session to the corresponding application-layer protocol parsing and recovery module for parsing, decoding and decompression to complete the restoration; go to step (4-6);
(4-6) match the suffix of the restored file against a library of conventional file suffixes; if the suffix does not belong to the library, determine the file format by intelligently analyzing the corresponding session information, file header information and file data, and add the correct suffix to the file; finally, store the restored file, submit the file information and the corresponding session information to the database, and go to step (4-1).
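The control flow of steps (4-1) to (4-6) can be sketched roughly as follows. This is only an illustration under stated assumptions: the claim does not say here how the load factor L is computed, so it is passed in as a number; an in-memory deque stands in for the session cache file; and the function names, the `MAGIC` table and the `KNOWN_SUFFIXES` set are hypothetical. The suffix check inspects only the file header, whereas the claim also draws on session information and file data.

```python
from collections import deque


def dispatch_once(load, threshold, pending, cache, restore):
    """Perform one pass of steps (4-1) to (4-5) and report the action taken."""
    if load > threshold:                  # (4-1) heavy load
        if pending:                       # (4-2) park the session in the cache
            cache.append(pending.popleft())
            return "cached"
        return "idle"
    if cache:                             # (4-3)/(4-5) drain cached sessions first
        restore(cache.popleft())
        return "restored-from-cache"
    if pending:                           # (4-4) restore straight from the queue
        restore(pending.popleft())
        return "restored"
    return "idle"


# Well-known file signatures; a real suffix library would be far larger.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": ".png",
    b"%PDF": ".pdf",
    b"GIF8": ".gif",
    b"PK\x03\x04": ".zip",
}
KNOWN_SUFFIXES = set(MAGIC.values()) | {".html", ".txt"}


def fix_suffix(name, data):
    """Step (4-6): keep a conventional suffix, otherwise infer one from the header."""
    dot = name.rfind(".")
    suffix = name[dot:].lower() if dot != -1 else ""
    if suffix in KNOWN_SUFFIXES:
        return name
    for magic, ext in MAGIC.items():
        if data.startswith(magic):
            return name + ext
    return name                           # format unknown: leave the name unchanged
```

Draining the cache before the live queue when the load is light preserves the order in which sessions were originally queued.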
2. The network flow recovery method according to claim 1, characterized in that setting the number m of coarse-grained network flows according to the number of server computing cores means that m equals the number of computing cores of the server; and setting, according to m, the number n of fine-grained network flows into which each coarse-grained flow is divided means that n is 2 or 4.
3. The network flow recovery method according to claim 1, characterized in that the XOR operations in step (1-4) and step (2-3) are computed in units of 2 bytes, and the hash calculation in step (3-2) is likewise computed in units of 2 bytes: every parameter taking part in the XOR is split into 2-byte units (a parameter that is exactly 16 bits long needs no splitting), and all the resulting units are then XORed together.
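The 2-byte XOR described in claim 3 can be sketched as below. The function name `xor_fold16` is hypothetical, big-endian byte order is an assumption, and so is the left-padding of odd-length parameters, since the claim does not say how they are handled. Reducing the result modulo the flow count to pick a flow is likewise an illustrative assumption, not something stated in the claim.

```python
import ipaddress


def xor_fold16(*params: bytes) -> int:
    """Split every parameter into 2-byte units and XOR all units together."""
    acc = 0
    for p in params:
        if len(p) % 2:                 # assumption: left-pad odd-length input
            p = b"\x00" + p
        for i in range(0, len(p), 2):
            acc ^= int.from_bytes(p[i:i + 2], "big")
    return acc


# Example: XOR of two IPv4 addresses, then an assumed mapping to one of n flows.
src = ipaddress.ip_address("10.0.0.1").packed   # units 0x0a00, 0x0001
dst = ipaddress.ip_address("10.0.0.2").packed   # units 0x0a00, 0x0002
n = 4
flow_index = xor_fold16(src, dst) % n
```

A 6-byte MAC address splits into three units and a 16-byte IPv6 address into eight, so the same routine covers both splitting stages and the four-tuple hash.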
CN201010112581XA 2010-02-23 2010-02-23 Network flow recovery method Expired - Fee Related CN101795230B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010112581XA CN101795230B (en) 2010-02-23 2010-02-23 Network flow recovery method

Publications (2)

Publication Number Publication Date
CN101795230A CN101795230A (en) 2010-08-04
CN101795230B true CN101795230B (en) 2012-05-23

Family

ID=42587659

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010112581XA Expired - Fee Related CN101795230B (en) 2010-02-23 2010-02-23 Network flow recovery method

Country Status (1)

Country Link
CN (1) CN101795230B (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102761517B (en) * 2011-04-25 2015-06-24 工业和信息化部电信传输研究所 Content reduction method for high-speed network
CN102868628B (en) * 2011-07-06 2016-03-02 阿里巴巴集团控股有限公司 Flow segmentation, device and system
CN102523208A (en) * 2011-12-06 2012-06-27 无锡聚云科技有限公司 Application layer protocol parallel processing method under multi-core architecture
CN103856462B (en) * 2012-12-05 2017-02-15 深圳市快播科技有限公司 Method and system for managing sessions
CN103281213B (en) * 2013-04-18 2016-04-06 西安交通大学 A kind of network traffic content extracts and analyzes search method
CN103209135B (en) * 2013-05-03 2016-03-02 深圳市共进电子股份有限公司 A kind of control method turned based on the http traffic of linux platform
CN103401799A (en) * 2013-07-30 2013-11-20 曙光信息产业(北京)有限公司 Method and device for realizing load balance
CN103916316A (en) * 2014-04-11 2014-07-09 国家计算机网络与信息安全管理中心 Linear speed capturing method of network data packages
CN106034085A (en) * 2015-03-19 2016-10-19 中兴通讯股份有限公司 Load sharing method, transmission device and cascading device
CN106375118A (en) * 2016-08-31 2017-02-01 哈尔滨工业大学(威海) Multi-view-angle traffic mixed playback method and device
CN106850547A (en) * 2016-12-15 2017-06-13 华北计算技术研究所(中国电子科技集团公司第十五研究所) A kind of data restoration method and system based on http protocol
CN107145801A (en) * 2017-04-26 2017-09-08 浙江远望信息股份有限公司 The confidential document automatic discovering method that a kind of suffix name is distorted
CN107749828A (en) * 2017-10-09 2018-03-02 厦门市美亚柏科信息股份有限公司 IP packet deliveries acquisition method, device, terminal device and storage medium
CN107743102B (en) * 2017-10-31 2020-01-31 北京亚鸿世纪科技发展有限公司 efficient tcp session recombination method
CN108011850B (en) * 2017-12-18 2021-08-17 北京百度网讯科技有限公司 Data packet reassembly method and apparatus, computer device, and readable medium
CN108093048B (en) * 2017-12-19 2021-04-02 北京盖娅互娱网络科技股份有限公司 Method and device for acquiring application interaction data
CN111355689B (en) * 2018-12-21 2022-04-22 金篆信科有限责任公司 Stream data processing method and device
CN109766525A (en) * 2019-01-14 2019-05-17 湖南大学 A kind of sensitive information leakage detection framework of data-driven
CN110430172B (en) * 2019-07-18 2021-08-20 南京茂毓通软件科技有限公司 Internet protocol content restoration system and method based on dynamic session association technology
CN110545271A (en) * 2019-08-28 2019-12-06 北京天融信网络安全技术有限公司 method and system for restoring file
CN110661806B (en) * 2019-09-30 2021-07-30 华南理工大学广州学院 Intelligent substation process bus firewall system
CN110677425B (en) * 2019-09-30 2021-09-21 华南理工大学广州学院 Firewall system matching method for matching GOOSE message
CN111884883A (en) * 2020-07-29 2020-11-03 北京宏达隆和科技有限公司 Quick auditing processing method for service interface
CN112039904A (en) * 2020-09-03 2020-12-04 福州林科斯拉信息技术有限公司 Network traffic analysis and file extraction system and method
CN112583936B (en) * 2020-12-29 2022-09-09 上海阅维科技股份有限公司 Method for recombining transmission conversation flow
CN112995184B (en) * 2021-03-05 2022-07-12 中电积至(海南)信息技术有限公司 Multi-source network flow content complete restoration method and device
CN114629970B (en) * 2022-01-14 2023-07-21 华信咨询设计研究院有限公司 TCP/IP flow reduction method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101068248A (en) * 2007-06-07 2007-11-07 杭州华三通信技术有限公司 Long-distance mirror image method, image source equipment and image destination equipment
CN101330473A (en) * 2007-06-18 2008-12-24 电子科技大学 Method and apparatus for filtrating network rubbish information supported by multiple protocols
CN101488960A (en) * 2009-03-04 2009-07-22 哈尔滨工程大学 Apparatus and method for TCP protocol and data recovery based on parallel processing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104601583A (en) * 2015-01-21 2015-05-06 国家计算机网络与信息安全管理中心 Online real-time anonymization system and method for IP stream data
CN104601583B (en) * 2015-01-21 2017-11-10 国家计算机网络与信息安全管理中心 A kind of online real-time anonymous system and method for IP flow datas

Also Published As

Publication number Publication date
CN101795230A (en) 2010-08-04


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120523

Termination date: 20160223