CN107819646A - Network traffic classification system and method with distributed transmission - Google Patents
Network traffic classification system and method with distributed transmission
- Publication number
- CN107819646A (application number CN201710993791.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
- H04L43/0876—Network utilisation, e.g. volume of load or congestion level
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/19—Flow control; Congestion control at layers above the network layer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2441—Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/24—Traffic characterised by specific attributes, e.g. priority or QoS
- H04L47/2483—Traffic characterised by specific attributes, e.g. priority or QoS involving identification of individual flows
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/31—Flow control; Congestion control by tagging of packets, e.g. using discard eligibility [DE] bits
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/51—Discovery or management thereof, e.g. service location protocol [SLP] or web services
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a network traffic classification system and method with distributed transmission. In the DPI service identification system, a flow table detection module receives a data flow and detects whether the current data flow is already marked; a data flow feature library stores the features of data flows; a traffic identification module checks whether the data flow matches any traffic feature in the data flow feature library; and a protocol processing module processes data flows separately by class according to their marks. In the DFI traffic identification system, a sample acquisition module extracts the flow features of services that can be identified and divides them into classes; a classifier training module trains on the samples provided by the sample acquisition module to obtain a classification model; and a classifier prediction module classifies the unidentifiable data flows according to the classification model, marks the classified data flows, and sends them to the protocol processing module using parallel transmission.
Description
Technical field
The present invention relates to network traffic classification systems and classification methods, and in particular to a network traffic classification system and method with distributed transmission.
Background
With the rapid development of network application-layer services, it has become important to identify different kinds of network data traffic by technical means so that the traffic can be controlled and managed. The main methods currently used to identify the service carried by a network data flow are as follows.
Port-based identification of network data flow services: this technique identifies applications by the port numbers they have registered with IANA (Internet Assigned Numbers Authority). For example, when port 80 is detected, the application is assumed to be ordinary web browsing. However, some illegitimate applications in today's networks evade detection and supervision by hiding or spoofing port numbers, so that data flows masquerading as legitimate traffic erode the network. For example, the ports used by newer P2P protocols change dynamically. The accuracy of port-based identification is therefore falling steadily, and the method is increasingly unsuitable for identifying services in current network data flows.
Deep packet inspection (DPI) identification of network data flow services: port-based identification is powerless against new protocols that use dynamic ports. In addition to analysing the basic information at layer 4 and below, DPI also analyses the application layer, identifying applications and their content. It does so by analysing the application-layer payload of a series of packets and finding the application-layer signature, from which the service is identified. This method has great difficulty when the application-layer data is encrypted.
Deep flow inspection (DFI) identification of network data flow services: when the application-layer data is encrypted, DPI can hardly identify a flow by analysing application-layer features. DFI identifies services from the features of the flow itself, relying on the fact that different application types behave differently at the level of the session connection or data flow. DFI analyses features of the whole data flow, such as the average packet length of each flow and the inter-arrival time of its packets. It does not inspect application-layer data, so it works the same whether or not that data is encrypted. However, data flows belonging to services of the same kind usually have very similar features; for example, the traffic features of the two IM applications QQ and MSN may be very close. The drawback of this method is therefore that it can only distinguish a few broad classes of network traffic, such as IM, P2P and WEB.
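The two flow features named above can be illustrated with a minimal sketch. The function name, the input layout (a list of arrival-time and length pairs) and the dictionary keys are assumptions made for illustration; the patent does not prescribe how the features are computed.

```python
from statistics import mean

def flow_features(packets):
    """Compute two DFI-style features for one flow.

    `packets` is a hypothetical list of (arrival_time_seconds, length_bytes)
    tuples in arrival order.
    """
    lengths = [length for _, length in packets]
    times = [t for t, _ in packets]
    # Gaps between consecutive packet arrivals.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "mean_packet_length": mean(lengths),
        # A single-packet flow has no gaps; 0.0 is an arbitrary placeholder.
        "mean_interarrival": mean(gaps) if gaps else 0.0,
    }
```

Neither feature touches the payload, which is why this kind of analysis is unaffected by application-layer encryption.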
In summary, in the prior art port-based identification has low accuracy, DPI has great difficulty identifying services whose application-layer data is encrypted, and DFI can only distinguish broad classes of network traffic.
Summary of the invention
In view of this, the present invention proposes a network traffic classification system and method with distributed transmission that combines DPI and DFI, improving both the accuracy and the processing speed of network traffic identification and classification.
To this end, the network traffic classification system with distributed transmission provided by the present invention comprises a DPI service identification system and a DFI traffic identification system;
The DPI service identification system comprises:
A flow table detection module, for receiving a data flow and detecting whether the current data flow is already marked; if so, the data flow is sent to the protocol processing module; otherwise, the flow table detection module sends the data flow to the traffic identification module;
A data flow feature library, for storing the features of data flows;
A traffic identification module, for checking whether the data flow matches any traffic feature in the data flow feature library; if so, the current data flow is marked according to the matching traffic feature and sent to the protocol processing module, and the state table in the flow table detection module is updated; otherwise, the unidentifiable data flow is sent to the classifier prediction module of the DFI traffic identification system;
A protocol processing module, for processing data flows separately by class according to their marks;
The DFI traffic identification system comprises:
A classifier prediction module, for classifying the unidentifiable data flows according to the classification model, marking the classified data flows, and sending them to the protocol processing module using parallel transmission.
In one embodiment, the DFI traffic identification system further comprises: a sample acquisition module, for extracting the flow features of the services that the DPI service identification system can accurately identify, dividing them into classes, and using them as training samples for the classifier training module; the sample acquisition module is also used to obtain a sample file of the data flows online and send the sample file to the classifier training module;
A classifier training module, for training on the samples provided by the sample acquisition module to obtain a classification model;
In the DPI service identification system, the traffic identification module is also used to send the data flows it can identify to the sample acquisition module.
In one embodiment, the DPI service identification system of the system is connected to a TCP/IP-based network.
In one embodiment, the data flow feature library of the system contains application-layer features of multiple services belonging to several broad classes of network traffic.
In one embodiment, the flow table detection module of the system maintains a state table whose entries include: source IP address, destination IP address, source port, destination port and protocol number;
The traffic identification module first analyses the application-layer data of the network data flow and compares its application-layer features with the features in the data flow feature library. If the signature string of the application-layer data matches one or more features in the library, the traffic identification module marks the flow with the corresponding protocol ID and updates the flow table detection module with the flow. If the library contains no feature matching the signature string, the traffic identification module leaves the flow unmarked and sends it to the classifier prediction module for further identification.
In another aspect, the present invention also provides a network traffic classification method with distributed transmission, applied to a system that combines a deep packet inspection (DPI) service identification system with a deep flow inspection (DFI) traffic identification system. The DPI service identification system comprises a flow table detection module, a data flow feature library, a traffic identification module and a protocol processing module; the DFI traffic identification system comprises a sample acquisition module, a classifier training module and a classifier prediction module. The method comprises the following steps:
The flow table detection module receives a data flow and detects whether the current data flow is already marked; if so, the data flow is sent to the protocol processing module; otherwise, the flow table detection module sends the data flow to the traffic identification module;
The traffic identification module checks whether the data flow matches any traffic feature in the data flow feature library; if so, the current data flow is marked according to the matching traffic feature and sent to the protocol processing module, and the state table in the flow table detection module is updated; otherwise, the unidentifiable data flow is sent to the classifier prediction module of the DFI traffic identification system;
The classifier prediction module classifies the unidentifiable data flows according to the classification model, marks the classified data flows, and sends them to the protocol processing module using parallel transmission;
The protocol processing module processes data flows separately by class according to their marks.
In one embodiment, the DFI traffic identification system of the method further comprises: a sample acquisition module, for extracting the flow features of the services that the DPI service identification system can accurately identify, dividing them into classes, and using them as training samples for the classifier training module; after obtaining a sample file of the data flows online, the sample acquisition module sends the sample file to the classifier training module;
A classifier training module, for training on the samples provided by the sample acquisition module to obtain a classification model;
In the DPI service identification system, the traffic identification module sends the data flows it can identify to the sample acquisition module.
In one embodiment, the DPI service identification system of the method is connected to a TCP/IP-based network.
In one embodiment, the data flow feature library of the method contains application-layer features of multiple services belonging to several broad classes of network traffic.
In one embodiment, the flow table detection module of the method maintains a state table whose entries include: source IP address, destination IP address, source port, destination port and protocol number;
The traffic identification module first analyses the application-layer data of the network data flow and compares its application-layer features with the features in the data flow feature library. If the signature string of the application-layer data matches one or more features in the library, the traffic identification module marks the flow with the corresponding protocol ID and updates the flow table detection module with the flow. If the library contains no feature matching the signature string, the traffic identification module leaves the flow unmarked and sends it to the classifier prediction module for further identification.
As can be seen from the above, the network traffic classification system and method with distributed transmission provided by the present invention first applies DPI identification to network data and then uses DFI to classify the data flows that DPI cannot identify. While data flows are sent from the classifier prediction module to the protocol processing module, parallel distributed transmission is used. This effectively improves the accuracy of network traffic classification and greatly improves processing efficiency.
Brief description of the drawings
Fig. 1 is a structural diagram of the network traffic classification system with distributed transmission according to an embodiment of the present invention;
Fig. 2 is a structural block diagram of the DPI identification module in the network traffic classification system with distributed transmission according to an embodiment of the present invention;
Fig. 3 is a structural block diagram of the DFI identification module in the network traffic classification system with distributed transmission according to the embodiment;
Fig. 4 is a flow chart of the network traffic classification method with distributed transmission according to an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Referring to Fig. 1, the network traffic classification system with distributed transmission of the present invention, which combines DPI and DFI, is made up of two systems: a DPI service identification system and a DFI traffic identification system;
The DPI service identification system comprises:
A flow table detection module 11, for receiving a data flow and detecting whether the current data flow is already marked; if so, the data flow is sent to the protocol processing module; otherwise, the flow table detection module sends the data flow to the traffic identification module;
A data flow feature library 12, for storing the features of data flows;
A traffic identification module 13, for checking whether the data flow matches any traffic feature in the data flow feature library; if so, the current data flow is marked according to the matching traffic feature and sent to the protocol processing module, and the state table in the flow table detection module is updated; otherwise, the unidentifiable data flow is sent to the classifier prediction module of the DFI traffic identification system;
A protocol processing module 14, for processing data flows separately by class according to their marks.
The DFI traffic identification system comprises:
A sample acquisition module 15, for extracting the flow features of the services that the DPI service identification system can accurately identify, dividing them into classes, and using them as training samples for the classifier training module;
A classifier training module 16, for training on the samples provided by the sample acquisition module to obtain a classification model;
A classifier prediction module 17, for classifying the unidentifiable data flows according to the classification model, marking the classified data flows, and sending them to the protocol processing module using parallel transmission.
The traffic identification module 13 is also used to send the data flows it can identify to the sample acquisition module;
The sample acquisition module 15 is also used to obtain the sample file of these data flows online and send the sample file to the classifier training module.
In the DPI service identification system of the present invention, the data flow feature library contains the application-layer features of some of the services in each broad class of network traffic. For example, services in the instant messaging class include QQ and Baidu HI: the application-layer feature of QQ is that a packet starts with 0x02 and ends with 0x03, and that of Baidu HI is that the first eight bytes are 0x0000010031564d49. Services in the P2P class include TTlive and Sopcast: the application-layer feature of TTlive is that the payload of the first packet of each flow is 52 bytes long, with first three bytes 0xffff01 and last two bytes 0x0002; the application-layer feature of Sopcast is that the first packet carrying a payload matches the regular expression ^DESCRIBE.*User-Agent:WMPlayer
The present invention is described in more detail below with reference to the accompanying drawings.
As shown in Fig. 2, in the network traffic classification system combining DPI and DFI of the present invention, the DPI service identification system is connected to a TCP/IP-based network and comprises a flow table detection module, a protocol processing module, a traffic identification module and a data flow feature library.
The data flow feature library contains features of multiple services belonging to several broad classes of network traffic. Examples are as follows:
(1) Services in the IM (instant messaging) class include QQ and Baidu HI. The application-layer feature of QQ is that a packet starts with 0x02 and ends with 0x03; that of Baidu HI is that the first eight bytes are 0x0000010031564d49.
(2) Services in the P2P class include TTlive and Sopcast. The application-layer feature of TTlive is that the payload of the first packet of each flow is 52 bytes long, with first three bytes 0xffff01 and last two bytes 0x0002; the application-layer feature of Sopcast is that the first packet carrying a payload matches the regular expression ^DESCRIBE.*User-Agent:WMPlayer
The features of all of the above services are stored in the data flow feature library.
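The four example signatures above can be written as a small lookup, sketched here under stated assumptions: the names `SIGNATURES` and `match_signature` are hypothetical, the checks run in list order with the first hit winning, and the Sopcast pattern is compiled with DOTALL so that `.*` can span header lines, which the patent does not state.

```python
import re

# Sopcast pattern as quoted in the text; DOTALL is an assumption.
SOPCAST_RE = re.compile(rb"^DESCRIBE.*User-Agent:WMPlayer", re.DOTALL)

# (service name, predicate over the payload of the inspected packet)
SIGNATURES = [
    ("QQ", lambda p: len(p) >= 2 and p[0] == 0x02 and p[-1] == 0x03),
    ("Baidu HI", lambda p: p.startswith(bytes.fromhex("0000010031564d49"))),
    ("TTlive", lambda p: len(p) == 52
        and p.startswith(b"\xff\xff\x01") and p.endswith(b"\x00\x02")),
    ("Sopcast", lambda p: SOPCAST_RE.match(p) is not None),
]

def match_signature(payload: bytes):
    """Return the first service whose signature matches, or None."""
    for name, predicate in SIGNATURES:
        if predicate(payload):
            return name
    return None
```

A payload that matches nothing returns None, which corresponds to the flow being handed on to the DFI side.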
The flow table detection module maintains a state table. Each entry holds the five-tuple of a data flow (source IP address, destination IP address, source port, destination port, protocol number) and the ID of the protocol type to which the flow belongs. When a network data flow arrives, its five-tuple is first compared with the entries in the state table; if it is found, the flow is marked with the ID of its protocol type and sent to the protocol processing module.
For example, the entries in the state table have the format shown in the second row of the following table:
Source IP address | Destination IP address | Source port | Destination port | Protocol number | Protocol ID |
119.147.18.47 | 10.8.7.43 | 8000 | 4000 | 0x11 | 5 |
Here 119.147.18.47 is the source IP address, 10.8.7.43 is the destination IP address, 8000 is the source port, 4000 is the destination port, 0x11 is the protocol number (UDP), and 5 is a protocol ID that can be defined freely; for example, if the protocol ID of QQ is defined as 5, then 5 denotes a QQ data flow. When a new data flow enters the flow table detection module, its five-tuple is first compared with the first five fields (the five-tuple) of each entry in the table. If the five-tuple is found in the state table, the data flow is marked with the protocol ID and sent to the protocol processing module; if no entry matching the five-tuple is found, the data flow enters the traffic identification module.
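The state table lookup described above maps naturally onto a dictionary keyed by the five-tuple. The following is a minimal sketch; the class and method names are hypothetical, and the example entry reuses the row from the table.

```python
from typing import NamedTuple, Optional

class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 0x11 for UDP

class FlowTable:
    """State table: five-tuple -> protocol ID of an already identified flow."""

    def __init__(self):
        self._entries = {}

    def lookup(self, key: FiveTuple) -> Optional[int]:
        # None means the flow is not yet marked and must go to DPI.
        return self._entries.get(key)

    def update(self, key: FiveTuple, protocol_id: int) -> None:
        # Called after the traffic identification module has marked a flow.
        self._entries[key] = protocol_id

table = FlowTable()
qq_flow = FiveTuple("119.147.18.47", "10.8.7.43", 8000, 4000, 0x11)
table.update(qq_flow, 5)  # 5 is the example protocol ID for QQ
```

Because the key is hashable, a lookup costs roughly constant time regardless of how many flows have been recorded; eviction of stale entries is not covered by the text and is left out here.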
The traffic identification module first analyses the application-layer data of the network data flow and compares its application-layer features with the features in the data flow feature library. If the signature string of the application-layer data matches one or more features in the library, the traffic identification module marks the flow with the corresponding protocol ID and updates the flow table detection module with the flow. If the library contains no feature matching the signature string, the traffic identification module leaves the flow unmarked and sends it to the classifier prediction module of the DFI traffic identification system for further identification.
Storage has the application layer tagged word of business identified in advance in data flow feature library, such as bitspirit
20 byte perseverances are 0x13426974546f7272656e742070726f746f636f6c before application layer, are published papers under PP click-throughs
5 byte perseverances are 0x3c00000001 before application layer during part.Traffic identification module be exactly by with aspect ratio in storehouse to judging
Which kind of agreement whether data flow can identify and belong to.
Fig. 3 is a structural block diagram of the DFI part of the network traffic classification system combining DPI and DFI. It mainly comprises a sample acquisition module, a classifier training module and a classifier prediction module. The sample acquisition module takes as samples the data flows that the traffic identification module of Fig. 1 can accurately identify, assigns them to the broad network traffic classes defined beforehand, and extracts the required flow features from them. For example, QQ can be accurately identified by the traffic identification module and belongs to the IM (instant messaging) class, so every QQ network data flow can serve as a sample of the IM class. Likewise, Baidu HI can be accurately identified and also belongs to the IM class, so every Baidu HI network data flow can also serve as an IM sample. After the samples are obtained, the flow features of each sample are computed, such as the average packet length of the flow and the average inter-arrival time of its packets, and the sample is labelled with the class it belongs to. In the same way, samples of the P2P class can be extracted from TTlive and Sopcast network data flows, and likewise for the other classes. Collecting all these samples yields a sample file, whose format is shown in the following table:
Class ID | Feature index | Feature value | Feature index | Feature value | ... |
1 | 1 | 1000 | 2 | 0.005 | ... |
2 | 1 | 450 | 2 | 0.03 | ... |
1 | 1 | 950 | 2 | 0.006 | ... |
3 | 1 | 100 | 2 | 0.07 | ... |
... |
Each row of this file represents one sample, and the first field of each row gives the class to which the sample belongs. For example, if the P2P class is represented by ID 1, the IM (instant messaging) class by 2 and the WEB application class by 3, then the first and third rows of the file are P2P samples, the second row is an IM sample and the fourth row is a WEB application sample. The class ID in each row is followed by feature indices and the values of those features. For example, if the average packet length of the flow is given index 1 and the average packet inter-arrival time index 2, then the first row represents a sample whose average packet length is 1000 and whose average inter-arrival time is 0.005. Each flow certainly has more than two features; the others are not listed here. The role of the sample acquisition module is to extract flow features from the data flows that the traffic identification module can accurately identify and to save them in the form of a sample file.
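One way to serialise such rows is sketched below. The on-disk layout used here, a class ID followed by space-separated `index:value` pairs, is an assumption borrowed from common sparse-vector text formats; the patent only shows the logical columns. The function names are hypothetical.

```python
def format_sample(class_id, features):
    """Render one sample as 'class_id index:value index:value ...'.

    `features` maps feature index -> value, e.g. {1: 1000, 2: 0.005}.
    """
    pairs = " ".join(f"{index}:{value}" for index, value in sorted(features.items()))
    return f"{class_id} {pairs}"

def parse_sample(line):
    """Inverse of format_sample; values are read back as floats."""
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":")
        features[int(index)] = float(value)
    return int(label), features
```

With this layout the first row of the table becomes `1 1:1000 2:0.005`.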
The classifier training module obtains a classification model by training on the samples obtained by the sample acquisition module.
The classifier prediction module uses the classification model to classify the traffic that the traffic identification module cannot identify.
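The patent does not name a particular classifier. As a stand-in, the following sketch uses a nearest-centroid rule: training averages the feature vectors of each class, and prediction picks the class whose centroid is closest. The function names and the choice of algorithm are assumptions for illustration only.

```python
from collections import defaultdict
from math import dist

def train(samples):
    """Build a model {class_id: centroid} from (class_id, feature_vector) pairs."""
    grouped = defaultdict(list)
    for class_id, vector in samples:
        grouped[class_id].append(vector)
    return {
        class_id: [sum(column) / len(column) for column in zip(*vectors)]
        for class_id, vectors in grouped.items()
    }

def predict(model, vector):
    """Return the class whose centroid has the smallest Euclidean distance."""
    return min(model, key=lambda class_id: dist(model[class_id], vector))

# Rows of the sample table: (class ID, [mean packet length, mean inter-arrival time])
model = train([(1, [1000, 0.005]), (2, [450, 0.03]), (1, [950, 0.006]), (3, [100, 0.07])])
```

With unscaled features the packet length dominates the distance, so a practical classifier would normalise the features first or use a different model altogether.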
As shown in Fig. 1, the modules fall into two groups, online and offline. The flow table detection module, protocol processing module, traffic identification module, data flow feature library, sample acquisition module and classifier prediction module are online; the classifier training module is offline. Before online classification can be performed, sample acquisition and classifier training must first be carried out to generate a classification model; during this stage the traffic identification module sends the data flows it can accurately identify directly to the sample acquisition module.
After the sample acquisition module has obtained the sample file online, the classifier can be trained offline to obtain the classification model. When the traffic identification module in the DPI service identification system cannot identify a data flow, the flow is passed to the classifier prediction module of the DFI system, which classifies it according to the trained classification model.
In another aspect of the present invention, based on the above system, a network traffic classification method with distributed transmission is also provided, applied to a system that combines a DPI service identification system with a DFI traffic identification system. The DPI service identification system comprises a flow table detection module, a data flow feature library, a traffic identification module and a protocol processing module; the DFI traffic identification system comprises a sample acquisition module, a classifier training module and a classifier prediction module.
The method comprises the following steps:
(a) data flow first passes through the flow table detection module in DPI business identifying systems, flow table detection module detection current number
Whether according to stream in the state table that flow table detection module is safeguarded, when the data flow is in state table, then flow table detection module is direct
After current data flow label, send to protocol process module;When the data flow is not at state in table, then flow table detection module
The data flow is sent to traffic identification module, into (b) step;
(b) whether traffic identification module checks the data flow containing in the data flow feature library in DPI business identifying systems
Any one feature;When traffic identification module recognizes the flow spy for having matching with the data flow in data flow feature library
Sign, then it is specific data flow to mark the data flow corresponding to current message, updates the state table safeguarded in flow table detection module,
It will be sent simultaneously after current data flow label to protocol process module;When traffic identification module does not have in data flow feature library
The traffic characteristic with the data stream matches is recognized, then is sent the data flow to DFI flux recognition systems, into (c) step;
(c) traffic identification module sends the data flow that can be identified to the sample acquisition mould in DFI flux recognition systems
Block, after sample acquisition module obtains the sample file of the data flow online, the sample file is sent to classifier training mould
Block carries out off-line training, obtains disaggregated model, and this disaggregated model is sent to grader to classify by classifier training module predicts mould
Block;The disaggregated model that grader classification prediction module obtains according to training is to traffic identification module None- identified in (b) step
Data flow is classified;
(d) data flow of point good class is carried out respective markers and sent to protocol process module by grader classification prediction module,
Protocol process module according to, to the not isolabeling of data flow, carrying out specific business or for different major classes respectively in above step
Processing.
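Steps (a) and (b) can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name DpiFrontEnd, the two byte signatures, and the use of plain substring matching are assumptions made for illustration. The five-tuple key follows the state table fields listed in the claims (source IP address, destination IP address, source port, destination port, protocol number).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Five-tuple identifying a flow: (src_ip, dst_ip, src_port, dst_port, protocol number).
FlowKey = Tuple[str, str, int, int, int]

# Hypothetical application-layer signatures; a real feature library would be far larger.
FEATURE_LIBRARY: Dict[str, bytes] = {
    "HTTP": b"HTTP/1.1",
    "BitTorrent": b"BitTorrent protocol",
}


@dataclass
class DpiFrontEnd:
    # State table kept by the flow table detection module: flow key -> mark.
    state_table: Dict[FlowKey, str] = field(default_factory=dict)

    def match_features(self, payload: bytes) -> Optional[str]:
        """Traffic identification: return the first mark whose signature is in the payload."""
        for mark, signature in FEATURE_LIBRARY.items():
            if signature in payload:
                return mark
        return None

    def handle(self, key: FlowKey, payload: bytes) -> Tuple[str, Optional[str]]:
        """Return (path taken, mark); the mark is None when DFI must decide."""
        if key in self.state_table:           # step (a): flow already in the state table
            return "flow_table", self.state_table[key]
        mark = self.match_features(payload)   # step (b): compare against the feature library
        if mark is not None:
            self.state_table[key] = mark      # later packets of this flow skip inspection
            return "dpi", mark
        return "dfi", None                    # no match: hand over to the DFI system
```

In this sketch the first packet of a flow pays the cost of signature matching, and every later packet of the same flow is resolved by a single dictionary lookup, which is the purpose of updating the state table in step (b).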
Figure 4 shows a flow chart of the network traffic classification method for distributed transmission. In one embodiment, the method comprises the following steps:
Step 1, the flow table detection module receives the input data flow;
Step 2, the flow table detection module detects whether the current data flow is already marked; if so, the data flow is sent to the protocol processing module; otherwise, the flow table detection module sends the data flow to the traffic identification module;
Step 3, the traffic identification module checks whether the data flow matches any traffic feature in the data flow feature library; if so, it marks the current data flow according to the matching traffic feature, sends it to the protocol processing module, and the method proceeds to step 4; otherwise, it sends the unidentifiable data flow to the classifier prediction module of the DFI traffic identification system and the method proceeds to step 5;
Step 4, the state table in the flow table detection module is updated, and the method returns to step 1;
Step 5, the classifier prediction module classifies the unidentifiable data flow according to the classification model, marks the classified data flow, and sends it to the protocol processing module using parallel transmission;
Here, the traffic identification module also sends the data flows that it can identify to the sample acquisition module; after obtaining sample files of those data flows online, the sample acquisition module sends them to the classifier training module;
the classifier training module performs offline training to obtain the classification model and sends it to the classifier prediction module;
Step 6, the protocol processing module processes each data flow according to its mark, handling each classification differently.
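The offline training and online prediction described alongside step 5 can be sketched as follows. The patent text here does not name a classification algorithm, so this example substitutes a nearest-centroid classifier purely for illustration; the two flow features (mean packet size and mean inter-arrival time) and the class names used in the usage note are likewise assumptions.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

# Assumed flow features: (mean packet size in bytes, mean inter-arrival time in ms).
Features = Tuple[float, float]


def train(samples: Sequence[Tuple[Features, str]]) -> Dict[str, Features]:
    """Offline training: compute one centroid per traffic class from labeled samples."""
    grouped: Dict[str, List[Features]] = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: (
            sum(f[0] for f in rows) / len(rows),
            sum(f[1] for f in rows) / len(rows),
        )
        for label, rows in grouped.items()
    }


def predict(model: Dict[str, Features], features: Features) -> str:
    """Online prediction: return the class whose centroid is nearest (Euclidean distance)."""
    return min(model, key=lambda label: dist(model[label], features))
```

As a usage example, training on samples labeled by the DPI stage, such as ((1400.0, 2.0), "streaming") and ((80.0, 200.0), "interactive"), produces a model that predict() can then apply to a flow the DPI stage could not identify. The features are left unscaled for brevity; a practical classifier would normalize them so that no single feature dominates the distance.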
The steps above show how network data is processed when traffic is classified online; the precondition is that the classifier has already been trained and a classification model obtained.
First, when network traffic arrives, it reaches the flow table detection module, which examines the leading fields of the packet to determine whether the current packet is already marked. If the type of the data flow to which the current packet belongs is marked, the data flow is handled in the manner corresponding to that type. If the type is not marked, the packet enters the traffic identification module for identification, which relies on the data flow feature library in Fig. 1. If the traffic identification module can identify the flow, it updates the flow table detection module so that subsequent packets of the same flow are recognized at the flow table detection stage. If the traffic identification module cannot identify the flow, the flow passes to the classifier prediction module, which classifies it using the classification model obtained through DFI offline training. Because all network traffic necessarily belongs to one of several broad classes, every flow that the DPI traffic identification module cannot identify is classified here by broad class. Once classified, the flow is sent to the protocol processing module, which processes it according to its classification. The protocol processing module handles two kinds of objects: specific services, and broad network traffic classes.
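The two kinds of objects handled by the protocol processing module can be pictured as a dispatch on the kind of mark a flow carries. The sketch below is only an illustration: the Mark type, the handler names, and the returned strings are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Mark:
    kind: str   # "service" (assigned by the DPI stage) or "class" (assigned by the DFI stage)
    name: str   # for example a protocol ID or a broad class name


def handle_service(name: str) -> str:
    """Placeholder for per-service processing of a DPI-identified flow."""
    return f"service policy applied to {name}"


def handle_class(name: str) -> str:
    """Placeholder for per-class processing of a DFI-classified flow."""
    return f"class policy applied to {name}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "service": handle_service,
    "class": handle_class,
}


def process(mark: Mark) -> str:
    """Dispatch a marked flow to the handler registered for its kind of mark."""
    return HANDLERS[mark.kind](mark.name)
```

Keeping the handlers in a table means a flow marked by either stage takes the same entry point, and only the kind of mark decides whether service-level or class-level processing is applied.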
Processing network traffic in this way is more comprehensive than using DPI or DFI alone: it can accurately identify services whose application layer is not encrypted, and it can also distinguish the broad class of services whose application layer is encrypted.
Those of ordinary skill in the art should understand that the discussion of any embodiment above is merely exemplary and is not intended to imply that the scope of the present disclosure (including the claims) is limited to these examples. Within the concept of the present invention, technical features of the above embodiments or of different embodiments may also be combined, the steps may be carried out in any order, and many other variations of the different aspects of the invention described above exist, which are not set out in detail for the sake of brevity. Therefore, any omission, modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
1. A network traffic classification system for distributed transmission, characterized in that it comprises a DPI service identification system and a DFI traffic identification system;
wherein the DPI service identification system comprises:
a flow table detection module, configured to receive a data flow and detect whether the current data flow is already marked; if so, the data flow is sent to the protocol processing module; otherwise, the flow table detection module sends the data flow to the traffic identification module;
a data flow feature library, configured to store features of data flows;
a traffic identification module, configured to check whether the data flow matches any traffic feature in the data flow feature library; if so, to mark the current data flow according to the matching traffic feature, send it to the protocol processing module, and update the state table in the flow table detection module; otherwise, to send the unidentifiable data flow to the classifier prediction module of the DFI traffic identification system;
a protocol processing module, configured to process each data flow according to its mark, handling each classification differently;
and the DFI traffic identification system comprises:
a classifier prediction module, configured to classify the unidentifiable data flow according to the classification model, mark the classified data flow, and send it to the protocol processing module using parallel transmission.
2. The system according to claim 1, characterized in that the DFI traffic identification system further comprises: a sample acquisition module, configured to extract the flow features of the services that the DPI service identification system can identify accurately and divide them into different classes to serve as training samples for the classifier training module, and further configured to obtain sample files of the data flow online and send them to the classifier training module;
a classifier training module, configured to train on the samples provided by the sample acquisition module to obtain the classification model;
and in the DPI service identification system, the traffic identification module is further configured to send the data flows that it can identify to the sample acquisition module.
3. The system according to claim 1, characterized in that the DPI service identification system is connected to a network based on the TCP/IP protocol.
4. The system according to claim 1, characterized in that the data flow feature library contains application layer features of multiple services, each belonging to one of several broad network traffic classes.
5. The system according to claim 1, characterized in that the flow table detection module maintains a state table whose information includes: source IP address, destination IP address, source port, destination port, and protocol number;
the traffic identification module first analyzes the application layer data of the network data flow and compares its application layer features with the features in the data flow feature library; if the feature string of the application layer data matches one or more features in the data flow feature library, the traffic identification module marks the data flow with the corresponding protocol ID and updates the flow table detection module with the data flow; if no feature in the data flow feature library matches the feature string, the traffic identification module leaves the data flow unmarked and sends it to the classifier prediction module, which identifies it further.
6. A network traffic classification method for distributed transmission, characterized in that it is applied to a system combining a deep packet inspection (DPI) service identification system with a deep flow inspection (DFI) traffic identification system, the DPI service identification system comprising a flow table detection module, a data flow feature library, a traffic identification module, and a protocol processing module, and the DFI traffic identification system comprising a sample acquisition module, a classifier training module, and a classifier prediction module, the method comprising the following steps:
the flow table detection module receives a data flow and detects whether the current data flow is already marked; if so, the data flow is sent to the protocol processing module; otherwise, the flow table detection module sends the data flow to the traffic identification module;
the traffic identification module checks whether the data flow matches any traffic feature in the data flow feature library; if so, it marks the current data flow according to the matching traffic feature, sends it to the protocol processing module, and updates the state table in the flow table detection module; otherwise, it sends the unidentifiable data flow to the classifier prediction module of the DFI traffic identification system;
the classifier prediction module classifies the unidentifiable data flow according to the classification model, marks the classified data flow, and sends it to the protocol processing module using parallel transmission;
the protocol processing module processes each data flow according to its mark, handling each classification differently.
7. The method according to claim 6, characterized in that
the DFI traffic identification system further comprises: a sample acquisition module, configured to extract the flow features of the services that the DPI service identification system can identify accurately and divide them into different classes to serve as training samples for the classifier training module, and, after obtaining sample files of the data flow online, to send the sample files to the classifier training module;
a classifier training module, configured to train on the samples provided by the sample acquisition module to obtain the classification model;
and in the DPI service identification system, the traffic identification module sends the data flows that it can identify to the sample acquisition module.
8. The method according to claim 6, characterized in that the DPI service identification system is connected to a network based on the TCP/IP protocol.
9. The method according to claim 6, characterized in that the data flow feature library contains application layer features of multiple services, each belonging to one of several broad network traffic classes.
10. The method according to claim 6, characterized in that the flow table detection module maintains a state table whose information includes: source IP address, destination IP address, source port, destination port, and protocol number;
the traffic identification module first analyzes the application layer data of the network data flow and compares its application layer features with the features in the data flow feature library; if the feature string of the application layer data matches one or more features in the data flow feature library, the traffic identification module marks the data flow with the corresponding protocol ID and updates the flow table detection module with the data flow; if no feature in the data flow feature library matches the feature string, the traffic identification module leaves the data flow unmarked and sends it to the classifier prediction module, which identifies it further.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710993791.6A CN107819646A (en) | 2017-10-23 | 2017-10-23 | A kind of net flow assorted system and method for distributed transmission |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710993791.6A CN107819646A (en) | 2017-10-23 | 2017-10-23 | A kind of net flow assorted system and method for distributed transmission |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107819646A true CN107819646A (en) | 2018-03-20 |
Family
ID=61608425
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710993791.6A Pending CN107819646A (en) | 2017-10-23 | 2017-10-23 | A kind of net flow assorted system and method for distributed transmission |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107819646A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108900374A (en) * | 2018-06-22 | 2018-11-27 | 网宿科技股份有限公司 | A kind of data processing method and device applied to DPI equipment |
CN109275045A (en) * | 2018-09-06 | 2019-01-25 | 东南大学 | Mobile terminal encrypted video ad traffic recognition methods based on DFI |
CN109327389A (en) * | 2018-11-13 | 2019-02-12 | 南京中孚信息技术有限公司 | Traffic classification label forwarding method, device and system |
CN109361618A (en) * | 2018-10-11 | 2019-02-19 | 平安科技(深圳)有限公司 | Data traffic labeling method, device, computer equipment and storage medium |
CN111404832A (en) * | 2019-01-02 | 2020-07-10 | ***通信有限公司研究院 | Service classification method and device based on continuous TCP link |
CN111917665A (en) * | 2020-07-23 | 2020-11-10 | 华中科技大学 | Terminal application data stream identification method and system |
CN112350956A (en) * | 2020-10-23 | 2021-02-09 | 新华三大数据技术有限公司 | Network traffic identification method, device, equipment and machine readable storage medium |
CN112383489A (en) * | 2020-11-16 | 2021-02-19 | 中国信息通信研究院 | Network data traffic forwarding method and device |
WO2021104444A1 (en) * | 2019-11-27 | 2021-06-03 | 华为技术有限公司 | Data flow classification method, apparatus and system |
CN113313216A (en) * | 2021-07-30 | 2021-08-27 | 深圳市永达电子信息股份有限公司 | Method and device for extracting main body of network data, electronic equipment and storage medium |
CN114050926A (en) * | 2021-11-09 | 2022-02-15 | 南方电网科学研究院有限责任公司 | Data message depth detection method and device |
CN115174240A (en) * | 2022-07-13 | 2022-10-11 | 中国国家铁路集团有限公司 | Railway encrypted flow monitoring system and method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101605067A (en) * | 2009-04-22 | 2009-12-16 | 网经科技(苏州)有限公司 | Network behavior active analysis diagnostic method |
CN101645806A (en) * | 2009-09-04 | 2010-02-10 | 东南大学 | Network flow classifying system and network flow classifying method combining DPI and DFI |
CN101986609A (en) * | 2009-07-29 | 2011-03-16 | 中兴通讯股份有限公司 | Method and system for realizing network flow cleaning |
CN102984076A (en) * | 2012-12-03 | 2013-03-20 | 中国联合网络通信集团有限公司 | Method and device for identifying flow service types |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108900374A (en) * | 2018-06-22 | 2018-11-27 | 网宿科技股份有限公司 | A kind of data processing method and device applied to DPI equipment |
CN109275045A (en) * | 2018-09-06 | 2019-01-25 | 东南大学 | Mobile terminal encrypted video ad traffic recognition methods based on DFI |
CN109275045B (en) * | 2018-09-06 | 2020-12-25 | 东南大学 | DFI-based mobile terminal encrypted video advertisement traffic identification method |
CN109361618A (en) * | 2018-10-11 | 2019-02-19 | 平安科技(深圳)有限公司 | Data traffic labeling method, device, computer equipment and storage medium |
CN109361618B (en) * | 2018-10-11 | 2022-10-28 | 平安科技(深圳)有限公司 | Data flow marking method and device, computer equipment and storage medium |
CN109327389A (en) * | 2018-11-13 | 2019-02-12 | 南京中孚信息技术有限公司 | Traffic classification label forwarding method, device and system |
CN109327389B (en) * | 2018-11-13 | 2021-06-08 | 南京中孚信息技术有限公司 | Traffic classification label forwarding method, device and system |
CN111404832A (en) * | 2019-01-02 | 2020-07-10 | ***通信有限公司研究院 | Service classification method and device based on continuous TCP link |
WO2021104444A1 (en) * | 2019-11-27 | 2021-06-03 | 华为技术有限公司 | Data flow classification method, apparatus and system |
CN111917665A (en) * | 2020-07-23 | 2020-11-10 | 华中科技大学 | Terminal application data stream identification method and system |
CN112350956B (en) * | 2020-10-23 | 2022-07-01 | 新华三大数据技术有限公司 | Network traffic identification method, device, equipment and machine readable storage medium |
CN112350956A (en) * | 2020-10-23 | 2021-02-09 | 新华三大数据技术有限公司 | Network traffic identification method, device, equipment and machine readable storage medium |
CN112383489A (en) * | 2020-11-16 | 2021-02-19 | 中国信息通信研究院 | Network data traffic forwarding method and device |
CN113313216B (en) * | 2021-07-30 | 2021-11-30 | 深圳市永达电子信息股份有限公司 | Method and device for extracting main body of network data, electronic equipment and storage medium |
CN113313216A (en) * | 2021-07-30 | 2021-08-27 | 深圳市永达电子信息股份有限公司 | Method and device for extracting main body of network data, electronic equipment and storage medium |
CN114050926A (en) * | 2021-11-09 | 2022-02-15 | 南方电网科学研究院有限责任公司 | Data message depth detection method and device |
CN115174240A (en) * | 2022-07-13 | 2022-10-11 | 中国国家铁路集团有限公司 | Railway encrypted flow monitoring system and method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107819646A (en) | A kind of net flow assorted system and method for distributed transmission | |
CN101741744B (en) | Network flow identification method | |
CN101645806B (en) | Network flow classifying system and network flow classifying method combining DPI and DFI | |
CN111340191B (en) | Bot network malicious traffic classification method and system based on ensemble learning | |
CN109361617B (en) | Convolutional neural network traffic classification method and system based on network packet load | |
EP3469770B1 (en) | Spam classification system based on network flow data | |
CN110796196B (en) | Network traffic classification system and method based on depth discrimination characteristics | |
CN110113345A (en) | A method of the assets based on Internet of Things flow are found automatically | |
CN104320304B (en) | A kind of core network user flow application recognition methods of the multimode fusion easily extended | |
CN104270392A (en) | Method and system for network protocol recognition based on tri-classifier cooperative training learning | |
CN101414939B (en) | Internet application recognition method based on dynamical depth package detection | |
CN109117634A (en) | Malware detection method and system based on network flow multi-view integration | |
CN107465643A (en) | A kind of net flow assorted method of deep learning | |
CN110034966B (en) | Data flow classification method and system based on machine learning | |
CN104468252A (en) | Intelligent network service identification method based on positive transfer learning | |
CN109151880A (en) | Mobile application flow identification method based on multilayer classifier | |
CN111698260A (en) | DNS hijacking detection method and system based on message analysis | |
CN108173705A (en) | First packet recognition methods, device, equipment and the medium of flow drainage | |
Kong et al. | Identification of abnormal network traffic using support vector machine | |
CN109660656A (en) | A kind of intelligent terminal method for identifying application program | |
CN106789416A (en) | The recognition methods of industrial control system specialized protocol and system | |
CN111224998B (en) | Botnet identification method based on extreme learning machine | |
CN114189350A (en) | LightGBM-based train communication network intrusion detection method | |
CN105429817A (en) | Illegal business identification device and illegal business identification method based on DPI and DFI | |
CN101764754A (en) | Sample acquiring method in business identifying system based on DPI and DFI |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180320 |