CN105516020B - Parallel network traffic classification method based on ontology knowledge reasoning - Google Patents

Parallel network traffic classification method based on ontology knowledge reasoning

Info

Publication number
CN105516020B
Authority
CN
China
Prior art keywords
network flow
network
ontology
flow
decision
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510974162.XA
Other languages
Chinese (zh)
Other versions
CN105516020A (en
Inventor
陶晓玲
韦毅
王勇
孔德艳
亢蕊楠
伍欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yunche Technology Co ltd
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology
Priority to CN201510974162.XA
Publication of CN105516020A
Application granted
Publication of CN105516020B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/24 Traffic characterised by specific attributes, e.g. priority or QoS
    • H04L47/2441 Traffic characterised by specific attributes, e.g. priority or QoS relying on flow classification, e.g. using integrated services [IntServ]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention is a parallel network traffic classification method based on ontology knowledge reasoning, comprising two steps. I. A training sample set of network flows labeled with application types is trained with a decision tree algorithm to build a decision tree classification model of network traffic, and the model is converted into an inference rule set. II. The Jena toolkit is used to construct a reasoner from the inference rule set; within the MapReduce parallel computing framework, the reasoner is invoked to perform parallel knowledge reasoning, which derives the correspondence between network flow instances and network application types in the network flow ontology and labels each network flow instance with its network application type, completing network traffic classification. The invention introduces the MapReduce parallel processing technique and uses cloud computing as the storage and computing resource for knowledge reasoning over the network flow ontology, so that network flow instances are classified in parallel and classification efficiency improves. By combining machine learning with ontology knowledge reasoning, it builds an inference rule set that classifies the flow instances in the network flow ontology directly and effectively.

Description

Parallel network traffic classification method based on ontology knowledge reasoning
Technical field
The present invention relates to the technical field of network management, and specifically to a parallel network traffic classification method based on ontology knowledge reasoning.
Background
With the rapid development of Web technologies and the growing demand for enterprise informatization, many new network application models and application requirements have emerged. The resulting network traffic data has grown explosively, posing unprecedented challenges for network supervision and strengthening users' demand for fine-grained management of network traffic. As a key technology for managing and optimizing network resources, network traffic classification is widely used in network monitoring, QoS (Quality of Service) management, network security, and trend analysis, and is an important part of efficient network management, traffic control, and security detection.
Network traffic classification refers to classifying the bidirectional TCP or UDP flows generated by network communication on the TCP/IP-based Internet according to network application type (such as WWW, FTP, MAIL, or P2P).
In recent years many researchers have turned to machine learning classification methods based on statistical features of network flows. These methods classify traffic with machine learning, using statistics of certain flow attributes (such as mean packet length and mean inter-packet arrival time), and are not affected by dynamic ports, payload encryption, or network address translation. The machine learning methods most widely used for network traffic classification today are Bayesian methods, neural networks, support vector machines, and decision trees.
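As an illustration of the two statistical attributes named above, the following minimal sketch computes the mean packet length and the mean inter-packet arrival time of a single flow. The function name `flow_features`, the `(timestamp, length)` packet representation, and the sample values are assumptions made for this example and do not come from the patent.

```python
from statistics import mean

def flow_features(packets):
    """Compute two flow statistics from (timestamp_seconds, length_bytes) pairs.

    The packets are assumed to be listed in arrival order for one flow.
    """
    lengths = [length for _, length in packets]
    times = [timestamp for timestamp, _ in packets]
    # Gap between each packet and the one that follows it.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "mean_packet_length": mean(lengths),
        "mean_inter_arrival": mean(gaps) if gaps else 0.0,
    }

# Hypothetical four-packet flow.
example_flow = [(0.00, 60), (0.01, 1500), (0.03, 1500), (0.06, 40)]
features = flow_features(example_flow)
```

Because such features depend only on packet sizes and timing, they can be computed without reading port numbers or payloads, which is why this family of methods is unaffected by dynamic ports and payload encryption.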
The network traffic classification research of Moore at the University of Cambridge focuses mainly on Bayesian methods and their improvements. Charalampos Rotsos, Moore, and others introduced a semi-supervised traffic classification method to train the classifier, modeling it with two algorithms, naive Bayes (NB) and NB with kernel estimation; experimental results showed that this method achieves higher classification performance than conventional methods. However, such algorithms are learning methods based on probability statistics and rely heavily on the distribution of the sample space, so they are potentially unstable.
Network traffic classification based on feedforward neural networks effectively eliminates the drawbacks of port-based and payload-based classification. Tests verified that it is more stable and robust than NB and shows good performance and promise for network traffic classification. However, even the back-propagation (BP) algorithm, the most widely applied neural network algorithm, exposes many defects in practice: it easily falls into local minima and fails to reach the global optimum, and the large number of training iterations makes learning inefficient and convergence slow.
An SVM classification algorithm that obtains network flow parameters from packet headers and then compares regular biased training with unbiased training has high computational complexity and slow training speed on large sample datasets. Classifying network traffic with SVM decision trees addresses the unrecognizable regions and long training time of SVM traffic classification. However, this research still cannot fully resolve the computational performance bottleneck, and the method is a supervised learning method that cannot readily discover new applications in network traffic.
To avoid inspecting packet payloads, Wei Li and Moore extracted 12 statistical features from the packets of a network flow while taking delay and throughput into account; with the C4.5 decision tree traffic classification method, classification accuracy reached 99.8%. Tomasz Bujlow et al. proposed a method using the C5.0 machine learning algorithm and verified experimentally that its average classification accuracy reaches 99.3-99.9%. However, decision trees lack scalability; when processing large datasets they tend to add overhead to the classification algorithm and reduce classification accuracy.
In high-speed, large-scale, complex network environments, each sensor network node collects packets with a different network traffic collection system, so the resulting traffic data differs in format, semantics, and syntax. Current network traffic data is therefore multi-source, heterogeneous, and massive. Most existing network traffic classification techniques can only apply simple formatting to the traffic data. They lack an effective solution for heterogeneous data (heterogeneous in format, syntax, and semantics) and lack description of, and knowledge reasoning over, traffic information (such as the acquisition environment). The collected traffic data is inconsistent, cannot be shared, and lacks traffic classification knowledge, so existing traffic classification methods can hardly provide the resource information needed for network management decision analysis.
In artificial intelligence, ontologies are increasingly applied in knowledge engineering, intelligent information integration, data mining, and the organization and processing of massive information. Ontologies offer an effective way to describe resources in a standardized, unambiguous, and extensible manner, with advantages in generality, openness, intelligence, accuracy, and comprehensiveness. As a knowledge representation tool, ontologies are also used in decision support systems, where knowledge reasoning is an important function, and they have been applied to classification problems (such as image classification).
In recent years researchers have tried to introduce ontologies into network traffic classification. Marcin Pietrzyk made a first attempt to formally define flow classes and, following classical ontology development guidelines, iteratively built a class taxonomy based on ontology instances, aiming to eliminate ambiguity in traffic class definitions. Chengjie Gu et al. proposed an online self-learning network traffic classification framework based on flow profiles and ontology, which classifies traffic through the mapping between flow profiles and traffic classes. However, current ontology-based network traffic classification methods cannot yet be applied to large-scale complex networks, and the application of ontologies to network traffic classification is still at an early stage.
Cloud computing is a data-centric, data-intensive supercomputing technology that processes and analyzes large datasets and provides efficient services to users, with characteristics such as parallelization, virtualization, and on-demand service. Its parallel processing technique MapReduce provides ample parallel computing semantics for large-scale data processing problems that can be partitioned, and is widely accepted. Cloud computing offers a new way to handle massive data processing in network traffic classification. Combining ontology and cloud computing for network traffic classification therefore exploits the strengths of each in describing and processing massive heterogeneous data: the ontology provides consistent description and knowledge management of network traffic information resources, and cloud computing provides the storage and computing resources for building the ontology and managing its knowledge.
Summary of the invention
The purpose of the invention is to disclose a parallel network traffic classification method based on ontology knowledge reasoning that classifies the network flow instances in a large-scale network flow ontology by combining a machine learning method with ontology knowledge reasoning.
In the parallel network traffic classification method based on ontology knowledge reasoning designed by the invention, a multi-layer network flow ontology is built from the Internet traffic collection environment and the information resources of the traffic, each network flow in the Internet corresponds to one network flow instance in the network flow ontology, and network traffic is classified as follows:
I. Build the decision tree classification model and generate the inference rule set
Network flows are selected from the Internet as samples, and the samples labeled with application types form the network flow training sample set. The training sample set is trained with a decision tree algorithm to build the decision tree classification model of network traffic, and the model is converted into an inference rule set;
II. Classify network flow instances in parallel through knowledge reasoning
The Jena toolkit is used to construct the corresponding reasoner from the inference rule set generated in step I. For the network flow ontology that has been built, the reasoner is invoked within the MapReduce parallel computing framework to perform parallel knowledge reasoning; that is, the correspondence between network flow instances and network application types in the network flow ontology is derived, each network flow instance is labeled with its network application type, and network traffic classification is complete. Jena is a toolkit for building and reasoning over ontologies, an open-source, Java-based Semantic Web toolkit developed by Hewlett-Packard in 2004.
Each step is described in detail below.
Step I comprises the following sub-steps:
I-1. The network flow training sample set labeled with application types is trained with a decision tree algorithm to build the decision tree classification model of network traffic. The set A = {a1, a2, …, ai} denotes the set of i statistical feature values of the network flows in the training sample set; the set T = {t1, t2, …, tj} denotes the set of j application types to which the network flows in the training sample set belong; the set V = {v1, v2, …, vk} denotes the set of k decision reference values, which the decision tree algorithm computes from the elements of A and which serve as the basis for choosing a decision path in the decision tree;
I-2. Every path from the root node to a leaf node in the decision tree classification model is treated as a classification path. Using the decision reference values as the basis, each classification path is converted into an "if-then" (IF-THEN) structure, establishing a network traffic classification model in IF-THEN form;
I-3. The IF-THEN network traffic classification model established in step I-2 is described in the inference rule syntax of the Jena toolkit, generating the inference rule set.
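Sub-steps I-2 and I-3 can be sketched as follows: every root-to-leaf path of a decision tree becomes one IF-THEN rule, which is then rendered as text in the style of Jena's rule syntax. This is only an illustration under stated assumptions: the toy tree, the feature names, the `nf:` prefix, and the `hasAppType` property are hypothetical, and the patent does not give its actual rules. The comparison builtins `le` and `greaterThan` are standard Jena rule builtins.

```python
# Hypothetical toy tree. Internal nodes are (feature, threshold, left, right),
# where the left branch means feature <= threshold; leaves are application types.
TREE = ("serverPort", 80,
        ("totalBytesForward", 500, "Mail", "WWW"),
        "P2P")

def classification_paths(node, conditions=()):
    """Yield (conditions, label) for every root-to-leaf path (sub-step I-2)."""
    if isinstance(node, str):
        yield list(conditions), node
        return
    feature, threshold, left, right = node
    yield from classification_paths(left, conditions + ((feature, "le", threshold),))
    yield from classification_paths(right, conditions + ((feature, "greaterThan", threshold),))

def to_if_then(conditions, label):
    """Render one classification path as a readable IF-THEN rule."""
    symbols = {"le": "<=", "greaterThan": ">"}
    tests = " AND ".join(f"{f} {symbols[op]} {v}" for f, op, v in conditions)
    return f"IF {tests} THEN appType = {label}"

def to_jena_rule(index, conditions, label, prefix="nf"):
    """Render one classification path as Jena-style rule text (sub-step I-3)."""
    features = []
    for feature, _, _ in conditions:
        if feature not in features:
            features.append(feature)
    # One triple pattern binds each feature value, then builtins test it.
    body = [f"(?flow {prefix}:{f} ?{f})" for f in features]
    body += [f"{op}(?{f}, {v})" for f, op, v in conditions]
    head = f"(?flow {prefix}:hasAppType {prefix}:{label})"
    return f"[rule{index}: {', '.join(body)} -> {head}]"

paths = list(classification_paths(TREE))
if_then_rules = [to_if_then(c, l) for c, l in paths]
jena_rules = [to_jena_rule(i + 1, c, l) for i, (c, l) in enumerate(paths)]
```

Each leaf yields exactly one rule, so the size of the rule set equals the number of leaves in the tree; the thresholds in the rule bodies play the role of the decision reference values in set V.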
Step II comprises the following sub-steps:
II-1. The Jena toolkit is used to construct the corresponding reasoner from the inference rule set generated in step I;
II-2. According to the performance of each compute node and the data scale of the network flow instances described in the network flow ontology, the built network flow ontology is split into multiple network flow ontology fragments; the fragments are uploaded to the Hadoop distributed file system, and each fragment is assigned an identifier;
II-3. Multiple MapReduce map functions are started, and <network flow ontology fragment identifier, network flow ontology fragment> key-value pairs are input to the map functions;
II-4. Each map function uses the reasoner constructed in step II-1 to perform knowledge reasoning over its network flow ontology fragment, obtaining the network application type label of every network flow instance in the fragment;
II-5. <network application type label, network flow instance> key-value pairs are output to the reduce function;
II-6. The reduce function merges network flow instances by network application type label, forming sets of classified network flow instances;
II-7. The sets of classified network flow instances are output, completing network traffic classification.
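The key-value flow of sub-steps II-3 to II-7 can be sketched in plain Python without Hadoop. This is a single-process simulation under stated assumptions: `infer_label` is a stand-in for the Jena reasoner, and the rules, fragment contents, and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rule set standing in for the reasoner: (predicate, label) pairs.
RULES = [
    (lambda flow: flow["serverPort"] == 80, "WWW"),
    (lambda flow: flow["serverPort"] == 25, "Mail"),
]

def infer_label(instance):
    """Stand-in for knowledge reasoning: the first matching rule wins."""
    for predicate, label in RULES:
        if predicate(instance):
            return label
    return "Unknown"

def map_fragment(fragment_id, fragment):
    """Map: <fragment id, fragment> -> list of <label, instance> pairs."""
    return [(infer_label(instance), instance) for instance in fragment]

def reduce_by_label(pairs):
    """Reduce: merge the instances that share an application type label."""
    grouped = defaultdict(list)
    for label, instance in pairs:
        grouped[label].append(instance["id"])
    return dict(grouped)

# Two hypothetical ontology fragments, keyed by fragment identifier.
fragments = {
    "O1": [{"id": "f1", "serverPort": 80}, {"id": "f2", "serverPort": 25}],
    "O2": [{"id": "f3", "serverPort": 80}, {"id": "f4", "serverPort": 6881}],
}
pairs = [pair for fid, frag in fragments.items() for pair in map_fragment(fid, frag)]
classified = reduce_by_label(pairs)
```

The map calls are independent of one another, which is what lets a real MapReduce job run them on different nodes; only the grouping by label requires data from all fragments.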
Compared with the prior art, the parallel network traffic classification method based on ontology knowledge reasoning of the invention has the following advantages: 1. It introduces MapReduce, a parallel processing technique for large-scale datasets, so cloud computing can serve as the storage and computing resource for knowledge reasoning over the network flow ontology, providing users with efficient service characterized by parallelization, virtualization, and on-demand provision. 2. Network flow instances are classified in parallel through knowledge reasoning, which effectively improves classification efficiency; adding compute nodes as appropriate speeds up classification. 3. By combining a machine learning method with ontology knowledge reasoning, it builds an inference rule set that classifies the network flow instances in the network flow ontology directly and effectively.
Brief description of the drawings
Fig. 1 is the overall framework of an embodiment of the parallel network traffic classification method based on ontology knowledge reasoning;
Fig. 2 is the architecture diagram of step II of the embodiment;
Fig. 3 compares the knowledge reasoning classification time of the embodiment in a stand-alone environment and in cluster environments;
Fig. 4 shows the speedup curves of the embodiment for different data scales and for cluster environments with different numbers of nodes.
Detailed description of the embodiments
This embodiment of the parallel network traffic classification method based on ontology knowledge reasoning uses the dataset collected and published by Professor Moore's team at the University of Cambridge as its network traffic information resource, referred to here as the Moore dataset. The Moore dataset used in this embodiment contains 377,526 network flow samples. Each sample is a complete bidirectional Transmission Control Protocol (TCP) flow with 248 network flow statistical features, made up of basic attributes such as the source and destination port numbers of the flow and statistical attributes such as the mean inter-packet arrival time; the last item labels the application type to which the flow belongs.
This embodiment selects 12 network application types from the Moore dataset as classification targets: World Wide Web (WWW), games (Games), services (Service), mail (Mail), attack (Attack), database (Database), interactive (Interactive), FTP control (FTP-Control), FTP passive connection (FTP-Pasv), FTP data (FTP-Data), multimedia (Multimedia), and peer-to-peer (P2P). Ten network flow statistical features are selected as the basis for knowledge reasoning: the server port number; the client port number; the total bytes of data in the forwarded same-direction packets; the total bytes of data in the forwarded reverse-direction packets; the total number of PUSH flags in the TCP headers of all same-direction packets; the total number of PUSH flags in the TCP headers of all reverse-direction packets; the total number of FIN flags in the TCP headers of all same-direction packets; the total number of FIN flags in the TCP headers of all reverse-direction packets; the total bytes of the initial windows of all same-direction packets; and the total bytes of the initial windows of all reverse-direction packets.
For greater objectivity, this embodiment splits the Moore dataset into two parts, used as the training sample set and the test sample set respectively. 3,000 samples are randomly selected from the training sample set as training samples, and 300,000 are randomly selected from the test sample set as test samples.
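The sampling procedure described above can be sketched as follows. The patent does not state the proportion in which the dataset is split, so `train_fraction` is a hypothetical parameter, and the example uses a scaled-down stand-in of 1,000 sample identifiers rather than the 377,526 real samples.

```python
import random

def split_and_sample(samples, train_fraction, n_train, n_test, seed=0):
    """Split samples into two disjoint parts, then draw a random subset from each."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train_pool, test_pool = shuffled[:cut], shuffled[cut:]
    # random.sample draws without replacement, so no sample is selected twice.
    return rng.sample(train_pool, n_train), rng.sample(test_pool, n_test)

# Scaled-down stand-in for the dataset (hypothetical sizes).
sample_ids = list(range(1000))
train, test = split_and_sample(sample_ids, train_fraction=0.2, n_train=30, n_test=600)
```

Splitting before sampling keeps the training and test samples disjoint, so the test result is not inflated by flows the model has already seen.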
The overall framework of the embodiment is shown in Fig. 1. This embodiment builds a multi-layer network flow ontology from the Moore dataset, with each network flow in the test samples of the Moore dataset corresponding to one network flow instance in the network flow ontology. A decision tree algorithm is trained on the network flow training samples labeled with application types to build the decision tree classification model of network traffic; the model is converted into an inference rule set, and the Jena toolkit is used to construct the corresponding reasoner from the rule set. For the built network flow ontology, the reasoner is invoked within the MapReduce parallel computing framework to perform parallel knowledge reasoning; that is, the correspondence between network flow instances and network application types in the network flow ontology is derived, each network flow instance is labeled with its network application type, and network traffic classification is complete.
I. Build the decision tree classification model and generate the inference rule set
I-1. The training sample set of this embodiment is trained with the decision tree algorithm built into the machine learning and data mining software Weka 3.7.10 to build the decision tree classification model of network traffic. Set A denotes the set of statistical feature values of the network flows in the training sample set: A = {server port number, client port number, total bytes of data in the forwarded same-direction packets, total bytes of data in the forwarded reverse-direction packets, total number of PUSH flags in the TCP headers of all same-direction packets, total number of PUSH flags in the TCP headers of all reverse-direction packets, total number of FIN flags in the TCP headers of all same-direction packets, total number of FIN flags in the TCP headers of all reverse-direction packets, total bytes of the initial windows of all same-direction packets, total bytes of the initial windows of all reverse-direction packets}. Set T denotes the set of application types to which the network flows in the training sample set belong: T = {WWW, Games, Service, Mail, Attack, Database, Interactive, FTP-Control, FTP-Pasv, FTP-Data, Multimedia, P2P}. Set V = {v1, v2, …, vk} denotes the set of k decision reference values, which the decision tree algorithm computes from the elements of A and which serve as the basis for choosing a decision path in the decision tree.
I-2. Every path from the root node to a leaf node in the decision tree classification model is treated as a classification path. Using the decision reference values as the basis, each classification path is converted into an "if-then" (IF-THEN) structure, establishing a network traffic classification model in IF-THEN form;
I-3. The IF-THEN network traffic classification model established in step I-2 is described in the inference rule syntax of the Jena toolkit, generating the inference rule set.
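The patent states only that the decision reference values in V are computed by Weka's built-in decision tree algorithm. As an illustration of how a decision tree learner can derive such a value from one numeric feature, the sketch below picks the binary split threshold with the highest information gain. This is a simplified, assumed stand-in rather than Weka's implementation (C4.5-style learners, for example, use gain ratio and further refinements), and the port numbers and labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, information_gain) for the best split 'value <= threshold'."""
    base = entropy(labels)
    best = (None, 0.0)
    candidates = sorted(set(values))
    # Try the midpoint between each pair of adjacent distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Hypothetical server port numbers with their application types.
ports = [25, 25, 80, 80, 8080, 8080]
apps = ["Mail", "Mail", "WWW", "WWW", "WWW", "WWW"]
threshold, gain = best_threshold(ports, apps)
```

In this toy data the split at 52.5 separates the two classes completely, so its gain equals the entropy of the whole label set; a threshold chosen this way is the kind of value that would appear in the condition of an IF-THEN rule.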
II. Classify network flow instances in parallel through knowledge reasoning
In this step, the Jena toolkit is used to construct the corresponding reasoner from the inference rule set generated in step I. For the built network flow ontology, the Jena reasoner is invoked within the MapReduce parallel computing framework to perform parallel knowledge reasoning, deriving the correspondence between network flow instances and network application types in the network flow ontology and labeling each network flow instance with its network application type, which completes network traffic classification. The step comprises the following sub-steps, as shown in Fig. 2:
II-1. The Jena toolkit is used to construct the corresponding reasoner from the inference rule set generated in step I;
II-2. According to the performance of each compute node and the data scale of the network flow instances described in the network flow ontology, the built network flow ontology is split into multiple network flow ontology fragments (ontology fragments O1 to On in Fig. 2); the fragments are uploaded to the Hadoop distributed file system, and each fragment is assigned an identifier;
II-3. Multiple MapReduce map functions are started (Map 1 to Map n in Fig. 2), and <network flow ontology fragment identifier, network flow ontology fragment> key-value pairs are input to the map functions;
II-4. Each map function uses the reasoner constructed in step II-1 to perform knowledge reasoning over its network flow ontology fragment, obtaining the network application type label of every network flow instance in the fragment (types L1 to Lm in Fig. 2);
II-5. <network application type label, network flow instance> key-value pairs are output to the reduce function;
II-6. The reduce functions (Reduce 1 to Reduce m in Fig. 2) merge network flow instances by network application type label, forming sets of classified network flow instances (flow sets C1 to Cm in Fig. 2);
II-7. The sets of classified network flow instances are output, completing network traffic classification.
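Sub-step II-2 splits the ontology according to node performance and data scale but does not specify a splitting rule. One possible reading, sketched below purely as an assumption, is to give each compute node a share of the flow instances proportional to a performance weight. The function name, the weights, and the fragment identifiers O1, O2, ... are hypothetical.

```python
def split_by_capacity(instances, node_weights):
    """Split instances into contiguous fragments sized in proportion to node weights."""
    total_weight = sum(node_weights)
    fragments = {}
    start = 0
    cumulative = 0
    for index, weight in enumerate(node_weights):
        cumulative += weight
        if index == len(node_weights) - 1:
            end = len(instances)  # the last fragment takes whatever remains
        else:
            end = len(instances) * cumulative // total_weight
        fragments[f"O{index + 1}"] = instances[start:end]
        start = end
    return fragments

# Ten hypothetical flow instances and three nodes, the third twice as capable.
flows = [f"flow{i}" for i in range(10)]
fragments = split_by_capacity(flows, [1, 1, 2])
```

Using cumulative boundaries rather than rounding each share separately guarantees that every instance lands in exactly one fragment.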
To verify the effectiveness of the method, the knowledge reasoning classification time was compared between a stand-alone environment and cluster environments for different network traffic data scales; the results are shown in Fig. 3. In Fig. 3 the abscissa is the number of network flow instances, in units of ten thousand, and the ordinate is the classification time in seconds. The ▽ line represents a single machine, another line represents 2 machines, the ◇ line represents 3 machines, and the △ line represents 4 machines. As Fig. 3 shows, when the number of network flow instances is small, the time needed for classification differs little across different numbers of compute nodes. In the small-scale classification task with only 60,000 flow samples, the stand-alone environment needs even less time than the cluster with only 2 nodes enabled and comes close to the cluster with 3 nodes enabled, because when the amount of instance data is small, MapReduce steps such as task scheduling and splitting and recombining data still take a certain amount of time. The method therefore shows no advantage when processing small-scale data. As the scale of network flow instance data grows, however, the gap between the classification time of the single machine and that of the clusters widens; the extra overhead of MapReduce gradually stabilizes and the advantage of parallel processing emerges, demonstrating the efficiency of the method's parallel processing.
To measure more precisely the performance gain that the method obtains from parallelization, the speedup R is used as the evaluation metric:
R = T_s / T_p
where T_s is the running time of the method in the stand-alone environment and T_p is its running time in the parallel environment. Fig. 4 shows the speedup curves of the method when the cluster uses 2, 3, and 4 machines, i.e. 2, 3, and 4 compute nodes. In Fig. 4 the abscissa is the number of network flow instances, in units of ten thousand, and the ordinate is the speedup of the classification time. The ▽ line represents 2 machines, another line represents 3 machines, and the ◇ line represents 4 machines. As Fig. 4 shows, for a fixed number of network flow instances the speedup changes stepwise as compute nodes are added; as the number of network flow instances grows, the speedup rises to a maximum, then gradually declines and levels off. Observation and analysis of the running state of each node show that when the number of network flow instances is small, cluster resource utilization is low and the resources of each compute node are not used effectively. As the number of instances increases, the speedup rises to its maximum, at which point cluster resource utilization peaks and the resources of every node in the cluster are scheduled well. As the number of instances continues to grow, the speedup gradually declines and then levels off, because cluster resource utilization has reached its bottleneck at the maximum speedup and the cluster scheduler begins adjusting its scheduling strategy, finally reaching a steady state.
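The speedup metric R (stand-alone running time divided by parallel running time) can be computed as in the following minimal sketch. The timings are hypothetical placeholders chosen to make the arithmetic easy to follow; they are not measurements from Fig. 3 or Fig. 4.

```python
def speedup(t_single, t_parallel):
    """Return R = T_s / T_p for one data scale and one cluster size."""
    if t_parallel <= 0:
        raise ValueError("parallel running time must be positive")
    return t_single / t_parallel

# Hypothetical running times in seconds for one fixed number of flow instances.
t_single = 120.0
parallel_times = {2: 80.0, 3: 60.0, 4: 48.0}  # compute nodes -> seconds
ratios = {nodes: speedup(t_single, t) for nodes, t in parallel_times.items()}
```

A value of R above 1 means the cluster finished sooner than the single machine; a value below 1 corresponds to the small-scale case in which MapReduce overhead outweighs the benefit of parallelism.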
The experimental results above show that the method effectively improves execution efficiency, and that the MapReduce parallel technique effectively improves the classification efficiency of network flow instances in a large-scale network flow ontology.
The embodiment described above is only a specific example that further details the purpose, technical solution, and beneficial effects of the invention; the invention is not limited to it. Any modification, equivalent replacement, or improvement made within the scope of the disclosure falls within the scope of protection of the invention.

Claims (3)

1. A parallel network traffic classification method based on ontology knowledge reasoning, in which a multi-layer network flow ontology is built from the Internet traffic collection environment and the information resources of the traffic, each network flow in the Internet corresponds to one network flow instance in the network flow ontology, and classification proceeds as follows:
I. Build the decision tree classification model and generate the inference rule set
Network flows are selected from the Internet as samples, and the samples labeled with application types form the network flow training sample set; the network flow training sample set labeled with application types is trained with a decision tree algorithm to build the decision tree classification model of network traffic, and the model is converted into an inference rule set;
II. Classify network flow instances in parallel through knowledge reasoning
The Jena toolkit is used to construct the corresponding reasoner from the inference rule set generated in step I; for the network flow ontology that has been built, the reasoner is invoked within the MapReduce parallel computing framework to perform parallel knowledge reasoning, that is, the correspondence between network flow instances and network application types in the network flow ontology is derived, each network flow instance is labeled with its network application type, and network traffic classification is complete.
2. The parallel network traffic classification method based on ontology knowledge reasoning according to claim 1, characterized in that:
Step I specifically comprises the following sub-steps:
I-1. The network traffic training sample set labeled with application types is trained with a decision tree algorithm to establish the decision tree classification model of network traffic; the set A = {a1, a2, ..., ai} denotes the set of i network flow statistical features in the training sample set; the set T = {t1, t2, ..., tj} denotes the set of j network application types to which the flows in the training sample set belong; the set V = {v1, v2, ..., vk} denotes the set of k decision reference values, which the decision tree algorithm computes from the elements of set A and which serve as the basis for choosing a decision path in the decision tree;
I-2. Every path from the root node to a leaf node in the decision tree classification model of network traffic is regarded as a classification path; based on the decision reference values, each classification path in the model is converted into an "if-then" (IF-THEN) structure, establishing a network traffic classification model in IF-THEN form;
I-3. The IF-THEN network traffic classification model established in step I-2 is described in the inference rule syntax of the Jena toolkit, generating the inference rule set.
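Sub-steps I-1 to I-3 can be illustrated with a minimal Python sketch. It assumes a small hand-built tree in place of a trained model; the feature names (`avgPacketSize`, `duration`), thresholds, application types, and the `nf:` namespace prefix are hypothetical and do not come from the patent. The emitted strings are only intended to resemble the Jena rule syntax (named rules in square brackets, with the `le` and `greaterThan` builtins); they are not the rule set the inventors generated.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    app_type: str          # an element of T (application type)


@dataclass
class Split:
    feature: str           # an element of A (statistical feature)
    threshold: float       # an element of V (decision reference value)
    low: "Node"            # branch taken when feature <= threshold
    high: "Node"           # branch taken when feature > threshold


Node = Union[Leaf, Split]
Condition = Tuple[str, str, float]   # (feature, operator, threshold)

# Hypothetical tree standing in for the trained decision tree of step I-1.
TREE = Split("avgPacketSize", 120.0,
             Leaf("DNS"),
             Split("duration", 30.0, Leaf("HTTP"), Leaf("P2P")))


def classification_paths(node: Node, prefix: Tuple[Condition, ...] = ()):
    """Step I-2: collect every root-to-leaf path as (conditions, app_type)."""
    if isinstance(node, Leaf):
        return [(list(prefix), node.app_type)]
    low = classification_paths(
        node.low, prefix + ((node.feature, "<=", node.threshold),))
    high = classification_paths(
        node.high, prefix + ((node.feature, ">", node.threshold),))
    return low + high


def to_if_then(conds: List[Condition], app_type: str) -> str:
    """Step I-2: render one classification path in IF-THEN form."""
    body = " AND ".join(f"{f} {op} {t}" for f, op, t in conds)
    return f"IF {body} THEN appType = {app_type}"


# Comparison operators mapped to Jena rule builtins.
JENA_BUILTIN = {"<=": "le", ">": "greaterThan"}


def to_jena_rule(index: int, conds: List[Condition], app_type: str,
                 ns: str = "nf:") -> str:
    """Step I-3: render one path as a Jena-style rule string (assumed vocabulary)."""
    terms = []
    for i, (feature, op, threshold) in enumerate(conds):
        var = f"?v{i}"
        terms.append(f"(?flow {ns}{feature} {var})")
        terms.append(f"{JENA_BUILTIN[op]}({var}, {threshold})")
    head = f"(?flow {ns}appType {ns}{app_type})"
    return f"[rule{index}: {' '.join(terms)} -> {head}]"


if __name__ == "__main__":
    for n, (conds, app_type) in enumerate(classification_paths(TREE), start=1):
        print(to_if_then(conds, app_type))
        print(to_jena_rule(n, conds, app_type))
```

Each leaf yields exactly one rule, so the size of the rule set equals the number of leaves in the tree.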
3. The parallel network traffic classification method based on ontology knowledge reasoning according to claim 1, characterized in that:
Step II specifically comprises the following sub-steps:
II-1. The inference rule set generated in step I is built into a corresponding reasoner using the Jena toolkit;
II-2. According to the performance of each computing node and the data scale of the network flow instances described in the network traffic ontology, the constructed network traffic ontology is split into multiple network traffic ontology fragments; the fragments are uploaded to the Hadoop Distributed File System, and each fragment is assigned an identifier;
II-3. Multiple MapReduce map functions are started, and <network traffic ontology fragment identifier, network traffic ontology fragment> key-value pairs are input to the map functions;
II-4. Each map function uses the reasoner built in step II-1 to perform knowledge reasoning on its network traffic ontology fragment, obtaining the network application type label of every network flow instance in the fragment;
II-5. <network application type label, network flow instance> key-value pairs are output to the reduce function;
II-6. The reduce function merges the network flow instances by network application type label, forming classified network flow instance sets;
II-7. The classified network flow instance sets are output, completing the network traffic classification.
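The data flow of sub-steps II-2 to II-7 can be sketched in plain Python. This is an in-process simulation, not Hadoop: a thread pool stands in for the MapReduce map tasks, a plain function `infer_app_type` with hypothetical thresholds stands in for the Jena reasoner, and the round-robin split ignores the node-performance criterion of step II-2. All names and sample values are assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def infer_app_type(flow):
    """Stand-in for the reasoner of step II-1 (hypothetical rules)."""
    if flow["avgPacketSize"] <= 120.0:
        return "DNS"
    return "HTTP" if flow["duration"] <= 30.0 else "P2P"


def split_ontology(flows, n_fragments):
    """Step II-2: split the instances into identified fragments.

    Round-robin splitting is a simplification; the claim sizes fragments
    by node performance and data scale.
    """
    return {frag_id: flows[frag_id::n_fragments]
            for frag_id in range(n_fragments)}


def map_fragment(item):
    """Steps II-3 to II-5: <fragment id, fragment> in, <label, instance> out."""
    _frag_id, fragment = item
    return [(infer_app_type(flow), flow["id"]) for flow in fragment]


def reduce_by_label(pairs):
    """Step II-6: merge instances that share an application type label."""
    grouped = defaultdict(list)
    for label, flow_id in pairs:
        grouped[label].append(flow_id)
    return {label: sorted(ids) for label, ids in grouped.items()}


def classify(flows, n_fragments=2):
    """Steps II-2 to II-7 end to end, with map tasks run in parallel."""
    fragments = split_ontology(flows, n_fragments)
    with ThreadPoolExecutor(max_workers=n_fragments) as pool:
        mapped = list(pool.map(map_fragment, fragments.items()))
    pairs = [pair for output in mapped for pair in output]
    return reduce_by_label(pairs)


# Hypothetical network flow instances.
FLOWS = [
    {"id": "f1", "avgPacketSize": 80.0, "duration": 1.0},
    {"id": "f2", "avgPacketSize": 900.0, "duration": 12.0},
    {"id": "f3", "avgPacketSize": 1400.0, "duration": 600.0},
    {"id": "f4", "avgPacketSize": 100.0, "duration": 2.0},
]

if __name__ == "__main__":
    print(classify(FLOWS))
```

Keying the map output by application type label lets the reduce stage group instances without knowing which fragment they came from, which is why the fragment identifier is dropped after the map stage.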
CN201510974162.XA 2015-12-22 2015-12-22 A kind of parallel network flow sorting technique based on ontology knowledge reasoning Active CN105516020B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510974162.XA CN105516020B (en) 2015-12-22 2015-12-22 A kind of parallel network flow sorting technique based on ontology knowledge reasoning

Publications (2)

Publication Number Publication Date
CN105516020A CN105516020A (en) 2016-04-20
CN105516020B true CN105516020B (en) 2018-09-11

Family

ID=55723670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510974162.XA Active CN105516020B (en) 2015-12-22 2015-12-22 A kind of parallel network flow sorting technique based on ontology knowledge reasoning

Country Status (1)

Country Link
CN (1) CN105516020B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106878307B (en) * 2017-02-21 2019-10-29 电子科技大学 A kind of unknown communication protocol recognition method based on bit error rate model
CN107800787B (en) * 2017-10-23 2020-10-16 图斯崆南京科技有限公司 Distributed big data real-time exchange sharing computer network system
CN110322037A (en) * 2018-03-28 2019-10-11 普天信息技术有限公司 Method for predicting and device based on inference pattern
US10673765B2 (en) 2018-09-11 2020-06-02 Cisco Technology, Inc. Packet flow classification in spine-leaf networks using machine learning based overlay distributed decision trees
CN109784370B (en) * 2018-12-14 2024-05-10 中国平安财产保险股份有限公司 Decision tree-based data map generation method and device and computer equipment
CN110245874B (en) * 2019-03-27 2024-05-10 中国海洋大学 Decision fusion method based on machine learning and knowledge reasoning
CN111914100A (en) * 2020-08-11 2020-11-10 中科院合肥技术创新工程院 Emergency decision knowledge representation method based on ontology
CN112784990A (en) * 2021-01-22 2021-05-11 支付宝(杭州)信息技术有限公司 Training method of member inference model
CN117313004B (en) * 2023-11-29 2024-03-12 南京邮电大学 QoS flow classification method based on deep learning in Internet of things

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104102700A (en) * 2014-07-04 2014-10-15 华南理工大学 Categorizing method oriented to Internet unbalanced application flow
CN104702465A (en) * 2015-02-09 2015-06-10 桂林电子科技大学 Parallel network flow classification method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Online Self-learning Internet Traffic Classification based on Profile and Ontology; Chengjie Gu, et al.; Journal of Convergence Technology; 20110430; Vol. 6, No. 4; pp. 81-91 *
Toward systematic methods comparison in traffic classification; M. Pietrzyk, et al.; Wireless Communication and Mobile Computing Conference; 20110812; pp. 1022-1027 *
A comparative study of network traffic classification methods; Hu Ting, et al.; Journal of Guilin University of Electronic Technology; 20100630; Vol. 30, No. 3; pp. 216-219 *

Also Published As

Publication number Publication date
CN105516020A (en) 2016-04-20

Similar Documents

Publication Publication Date Title
CN105516020B (en) A kind of parallel network flow sorting technique based on ontology knowledge reasoning
WO2021227322A1 (en) Ddos attack detection and defense method for sdn environment
CN108900432B (en) Content perception method based on network flow behavior
Yuan et al. An SVM-based machine learning method for accurate internet traffic classification
US8311956B2 (en) Scalable traffic classifier and classifier training system
CN105591972B (en) A kind of net flow assorted method based on ontology
Alshammari et al. Identification of VoIP encrypted traffic using a machine learning approach
CN109688056B (en) Intelligent network control system and method
Cheng et al. MATEC: A lightweight neural network for online encrypted traffic classification
CN104102700A (en) Categorizing method oriented to Internet unbalanced application flow
CN101252541A (en) Method for establishing network flow classified model and corresponding system thereof
CN107786388A (en) A kind of abnormality detection system based on large scale network flow data
CN101510873A (en) Method for detection of mixed point-to-point flux based on vector machine support
CN110324260A (en) A kind of network function virtualization intelligent dispatching method based on flow identification
Soleymanpour et al. An efficient deep learning method for encrypted traffic classification on the web
CN105577438B (en) A kind of network flow body constructing method based on MapReduce
Tan et al. An Internet Traffic Identification Approach Based on GA and PSO-SVM.
Liao et al. Intelligently modeling, detecting, and scheduling elephant flows in software defined energy cloud: A survey
Shirmarz et al. An autonomic software defined network (SDN) architecture with performance improvement considering
Coelho et al. BACKORDERS: using random forests to detect DDoS attacks in programmable data planes
Kamath et al. Machine learning based flow classification in DCNs using P4 switches
Min et al. Online Internet traffic identification algorithm based on multistage classifier
Chen et al. Online hybrid traffic classifier for Peer-to-Peer systems based on network processors
Li et al. NNSplit-SØREN: Supporting the model implementation of large neural networks in a programmable data plane
CN113850282A (en) Traffic management method, system and device based on dynamic classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20160420

Assignee: Guangxi Jun'an Network Security Technology Co.,Ltd.

Assignor: Guilin University of Electronic Technology

Contract record no.: X2022450000459

Denomination of invention: A parallel network traffic classification method based on ontology knowledge reasoning

Granted publication date: 20180911

License type: Common License

Record date: 20221228

TR01 Transfer of patent right

Effective date of registration: 20240409

Address after: 230000 B-1015, wo Yuan Garden, 81 Ganquan Road, Shushan District, Hefei, Anhui.

Patentee after: HEFEI MINGLONG ELECTRONIC TECHNOLOGY Co.,Ltd.

Country or region after: China

Address before: 541004 1 Jinji Road, Qixing District, Guilin, the Guangxi Zhuang Autonomous Region

Patentee before: Guilin University of Electronic Technology

Country or region before: China

TR01 Transfer of patent right

Effective date of registration: 20240415

Address after: Room 301, 3rd Floor, Building 3, No. 18 Ziyue Road, Laiguangying Township, Chaoyang District, Beijing, 100020

Patentee after: Beijing Yunche Technology Co.,Ltd.

Country or region after: China

Address before: 230000 B-1015, wo Yuan Garden, 81 Ganquan Road, Shushan District, Hefei, Anhui.

Patentee before: HEFEI MINGLONG ELECTRONIC TECHNOLOGY Co.,Ltd.

Country or region before: China