CN101258479A - Whole-network anomaly diagnosis - Google Patents

Publication number
CN101258479A
Authority
CN
China
Prior art keywords
communication network
network
unusual
network service
pattern
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2006800315024A
Other languages
Chinese (zh)
Inventor
M. Crovella
A. Lakhina
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BOSCHTON UNIV BOARDOF DIRECTORS
Boston University
Original Assignee
BOSCHTON UNIV BOARDOF DIRECTORS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BOSCHTON UNIV BOARDOF DIRECTORS filed Critical BOSCHTON UNIV BOARDOF DIRECTORS
Publication of CN101258479A publication Critical patent/CN101258479A/en

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The present invention relates to methods and apparatus for detecting, monitoring, or analyzing an unusual network event or a network anomaly in a communication network, and to the business of doing so for the benefit of others. Embodiments of the present invention can detect, monitor, or analyze the network anomaly by applying a number of statistical and mathematical methods. Embodiments of the present invention include both methods and apparatus to detect, monitor, or analyze the network anomaly. These include classification and localization.

Description

Whole-network anomaly diagnosis
Cross-reference to related applications
This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Nos. 60/694,853 and 60/694,840, filed June 29, 2005, the contents of which are incorporated herein by reference.
Federally sponsored research or development
The invention was developed in part with support from National Science Foundation grants ANI-9986397 and CCR-0325701.
Background
A network anomaly is an unusual event in a network that is of interest to parties such as network providers, network users, network operators, or law enforcement agencies. A network anomaly may be caused unintentionally by ordinary network conditions, for example by an outage of a network resource. It may also be caused deliberately, by a malicious attack from a hacker or by someone seeking to disrupt the network or degrade its performance.
Typically, network anomalies are monitored and analyzed by collecting data from a single network component, such as a link or a router. This data is collected in isolation from other network data and other network components. In other words, finding a network anomaly is closely tied to link-level traffic characteristics.
Another approach to monitoring and analyzing network anomalies treats an anomaly as a change in traffic volume. This can detect pronounced anomalies, but volume-based methods cannot detect low-rate anomalies (for example worms, port scans, and small outage events).
A further approach is to write a set of rules by hand. Whether an anomaly has been encountered is decided by matching or violating a rule. However, rule-based methods cannot detect new anomalies that have not been seen before.
Many current methods address a single network component and provide a separate solution for each class of network anomaly; a solution that covers many components of the network is preferable.
Summary of the invention
The present invention relates to methods and apparatus for detecting, monitoring, or analyzing an unusual network event or a network anomaly in a communication network, and to performing such detection, monitoring, or analysis for the benefit of others. Embodiments of the invention can detect, monitor, or analyze network anomalies by applying a number of statistical and mathematical methods. Embodiments of the invention include both methods and apparatus for detecting, monitoring, and analyzing network anomalies. These embodiments include classification and localization.
The present invention is a general technique for efficiently and continuously detecting and classifying anomalous events (abnormal conditions) in a network. The technique is based on analyzing the distributional properties of multiple traffic features (addresses, ports, etc.) across the whole network. The analysis of traffic feature distributions has two key elements, which are used to classify network anomalies into meaningful clusters.
First, the distributions of multiple traffic features (addresses, ports, protocols, etc.) of the network traffic are analyzed simultaneously. Anomaly detection based on feature distributions is highly sensitive, and it greatly extends volume-based detection by exposing low-rate anomalies that volume-based methods cannot detect.
Second, feature distributions of the network traffic are created in order to extract structural information about anomalies. This structural information is used to classify anomalies into distinct clusters that are meaningful both structurally and semantically. Anomalies are classified by an unsupervised method, so that classification requires no manual intervention or prior knowledge. This unsupervised approach allows the invention to recognize and classify new (previously unknown) anomalies (for example, new worms).
In addition, the invention analyzes multiple features of whole-network data, that is, data collected from multiple resources in the network. Analyzing the whole network makes it possible to detect anomalies that span the network. Combining whole-network analysis with feature distribution analysis allows the invention to detect and classify network-wide anomalies, extending the detection capability of current methods, which mainly apply volume analysis to data from a single resource.
Systematic analysis of data collected from multiple network resources (that is, network links, routers, etc.) is a principal feature of the invention. By using whole-network data, the invention can diagnose a wide range of anomalies, including anomalies that spread across the whole network. Diagnosis can determine the time at which an anomaly occurs, its location in the network, and its type.
Anomalies may arise from a wide variety of causes, ranging from abuse (attacks, worms, etc.) to unintentional events (equipment failure, human error, etc.). The technique is not limited to a point solution for each type of anomaly. Instead, by treating an anomaly as a substantial deviation from established normal behavior, the invention provides a general solution for diagnosing a large class of anomalous events.
One embodiment forms a time series having at least one dimension corresponding to communication network traffic handled by network components, and decomposes the time series into a number of communication network traffic patterns present in those network components.
Another embodiment forms at least one model having a dimension corresponding to one or more variations of the communication network traffic handled by network components, and detects anomalies in the communication network traffic patterns.
Another embodiment finds deviations in communication network traffic features.
Another embodiment generates at least one distribution of a communication network traffic feature, estimates the entropy of that feature, sets an entropy threshold for the feature, and designates the feature as anomalous when its estimated entropy differs from the threshold that was set.
Brief description of the drawings
These and other features of the invention are described in the detailed description below and in the accompanying drawings, in which:
Fig. 1 illustrates a whole network used as the data source for anomaly detection in the invention;
Fig. 2A-2C illustrate network data according to one aspect of the invention;
Fig. 2D-2G show distributions useful in understanding the invention;
Fig. 2H shows data obtained according to the invention that illustrates normal network traffic;
Fig. 2I shows data obtained according to the invention that illustrates anomalous network traffic;
Fig. 2J illustrates the clustering of data in two dimensions;
Fig. 3A-3B illustrate network data according to another aspect of the invention;
Fig. 3C illustrates a multi-dimensional matrix created by the invention from multiple features;
Fig. 3D illustrates the contents of the matrix of Fig. 3C;
Fig. 3E-3F illustrate the result of applying the entropy metric according to the invention;
Fig. 3G illustrates the matrix unfolding performed by the invention;
Fig. 3H illustrates the result of the clustering performed by the invention; and
Fig. 3I shows a table of anomaly features according to the invention.
Detailed description
The present invention relates to methods and apparatus for detecting, monitoring, or analyzing an unusual network event or a network anomaly in a communication network. The embodiments illustrate specific statistical techniques for detecting, monitoring, or analyzing network anomalies; other known techniques may also be used. Embodiments of the invention include both methods and apparatus for detecting, monitoring, or analyzing network anomalies. As used here, the term "whole network" means at least a substantial portion of a network, such that data collected from it is meaningful for anomaly detection and analysis.
Fig. 1 shows communication network 100. Communication network 100 has network components such as nodes a-m, which comprise routers, servers, and the like. Path 118 shows a traffic flow. The network components are connected by network links 102, whose characteristics, such as capacity and format, may be similar or markedly different. A particular network may have more or fewer network components and network links; indeed, for the Internet, this figure is a simplified representation of reality.
As shown, network node j is made up of a lower-level communication network of subnets, LANs (local area networks), personal computers, and mobile agents. This lower-level communication network (shown as network component 104) is made up of (sub)network components 106, which may be servers, routers, or other devices and may be similar or different in scope. Network links 108 are well known in the art. Each subnet component is typically made up of similar or different personal computers 120 or mobile agents 122. These components are connected by network links 110, which may be wireless or conventional links.
A computing device 124 in the network can be used to load program 112 of the invention from medium 114, in order to carry out the data mining and/or analysis of the data used in the invention. The analysis of the invention is performed on data 116 received from nodes or other components, or is performed elsewhere, for example at a processor 120 to which data 116 is sent over path 130. Data can be accessed from the network wherever it is available. If the analysis is performed by a third party, access authorization is required.
It should be noted that the above describes only an exemplary configuration of communication network 100. There may be more or fewer of any component of communication network 100, and a given network component or network link may have many lower-level layers.
Fig. 2A-2C illustrate a method of monitoring traffic, such as traffic 118, in communication network 100. Monitoring requires access to data across the whole network, so each node a-m must grant access permission to the monitoring processor. This step collects a large amount of data, which is typically arranged in matrix form. One or more collection machines (including one or more processing stations 124) are used for the collection.
In practicing the method, step 202 begins the process of forming a time series. The time series has at least one dimension corresponding to communication network 100 traffic on a number of network components, such as the network nodes along flow 118, in each of a number of time intervals. For purposes of illustration, the components are called sources. In step 204, the time series data is decomposed into a number of communication network 100 traffic patterns present in a number of the network component nodes a-m. Component 206 shows the mathematical form of the data after it is arranged into matrix 208, which represents the time series.
Each column of matrix 208 corresponds to a separate source, and each row holds the data collected during one collection time interval. The data includes traffic volume variables such as the number of bytes, the number of packets, and the number of traffic records. The data also includes the Internet Protocol (IP) and port addresses used to carry traffic on each link, for example for the PC 110 within each node. The data exhibits several features, such as source IP, source port, destination IP, and destination port. All of this data is available in blocks of network traffic. The data is collected on a per-link basis, that is, on an origin-destination (OD) basis.
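As an illustration only, the layout described for matrix 208 can be sketched as follows, assuming rows are time intervals and columns are sources. The record format `(interval, source, byte_count)` and the function name are hypothetical and are not taken from the patent.

```python
def build_traffic_matrix(records, sources, n_intervals):
    """Arrange (interval, source, byte_count) records into a t x m matrix.

    Rows are time intervals and columns are sources. The record format
    is a hypothetical one chosen for this sketch.
    """
    column = {name: j for j, name in enumerate(sources)}
    Y = [[0] * len(sources) for _ in range(n_intervals)]
    for interval, source, byte_count in records:
        # Accumulate the byte count of every record seen in this interval.
        Y[interval][column[source]] += byte_count
    return Y


records = [(0, "a-b", 500), (0, "b-c", 120), (1, "a-b", 480), (1, "a-b", 20)]
Y = build_traffic_matrix(records, sources=["a-b", "b-c"], n_intervals=2)
```

The same arrangement could hold packet counts or record counts instead of byte counts.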
In Fig. 2B, matrix 208 is processed in step 212 to extract patterns that are common over time, by examining each source one time interval after another. In this way the normal patterns are extracted. The remaining patterns are treated as anomalous. Normal patterns typically show traffic cycling over a regular period such as 24 hours (Fig. 2H). When these patterns are extracted from the data as a whole, the remaining data appears close to randomly distributed, but with a traffic spike at a suspected or identified anomaly (230 in Fig. 2I).
This operation is performed interval by interval, with step 214 iterating over all time intervals of matrix 208. In each iteration, step 214 determines whether the entire set of source data for the time interval is above a threshold (possibly anomalous) or below it (normal). The threshold can be set in advance and can also be updated over time from the data mining results.
When the traffic data for a time interval exceeds the threshold, processing moves to Fig. 2C, where in step 216 each suspected anomaly is evaluated by a hypothesis process that tests a number of possible locations in search of a match. This leads to a sub-step that finds the matching location or, failing that, determines the most likely candidate.
Step 218 analyzes the anomaly found in step 216 by comparing the anomalous traffic with the normal traffic of the suspected source. This yields the number of anomalous bytes, packets, or records at that source. From this value, step 220 reports the time, location, and size of the anomaly to authorized users. From step 220, step 222 returns processing to step 214 to evaluate the next time interval.
Fig. 2D-2G illustrate the difference between the distributions of normal and anomalous traffic. Fig. 2D-2E show normal traffic patterns as functions of detected packets, bytes (packet content), or flows, and of port. Fig. 2F-2G show anomalous traffic from the same data source. Fig. 2H and 2I show, respectively, the detected periodic behavior of normal traffic (2H) and the residual (2I), which is random in character with a spike 230 at the suspected anomaly.
To obtain this result mathematically, some form of dimensionality analysis is typically used. One form used by the invention is PCA (principal component analysis), described below.
PCA is a coordinate transformation method that maps a set of data points onto new axes. These axes are called the principal axes or principal components. When working with zero-mean data, each principal component has the property that it points in the direction of maximum variance remaining in the data, given the variance already accounted for by the preceding components. Thus the first principal component captures the greatest variance of the data that is possible on a single axis. Each subsequent principal component captures the maximum variance among the remaining orthogonal directions.
PCA is applied to the link data matrix 208, denoted Y, treating each row of Y as a point. Y must first be adjusted so that its columns have zero mean. This ensures that the PCA dimensions capture true variance, and so avoids skewed results caused by differences in mean link utilization. In what follows, Y denotes the mean-centered link traffic data.
Applying PCA to Y yields a set of m principal components $\{v_i\}_{i=1}^{m}$. The first principal component $v_1$ is the vector that points in the direction of maximum variance in Y:
$$v_1 = \arg\max_{\|v\|=1} \|Yv\| \qquad (1)$$
where $\|Yv\|^2$ is proportional to the variance of the data measured along $v$. Proceeding iteratively, once the first $k-1$ principal components have been determined, the k-th principal component corresponds to the maximum variance of the residual. The residual is the difference between the original data and the data mapped onto the first $k-1$ principal axes. Thus the k-th principal component $v_k$ can be written as:
$$v_k = \arg\max_{\|v\|=1} \left\| \left( Y - \sum_{i=1}^{k-1} Y v_i v_i^T \right) v \right\| \qquad (2)$$
An important use of PCA is to explore the intrinsic dimensionality of a set of data points. By examining the amount of variance captured by each principal component, $\|Yv_i\|^2$, one can determine whether most of the variability in the data is captured in a space of lower dimension. If only the variance along the first r dimensions is non-negligible, it can be concluded that the point set represented by Y effectively resides in an r-dimensional subspace of $\mathbb{R}^m$.
Once the principal axes have been determined, the data set can be mapped onto the new axes. The mapping of the data onto principal axis i is given by $Yv_i$. This vector can be normalized to unit length by dividing by $\|Yv_i\|$. Thus, for each principal axis i,
$$u_i = \frac{Y v_i}{\|Y v_i\|}, \quad i = 1, \ldots, m. \qquad (3)$$
The $u_i$ are vectors of size t and are orthogonal by construction. The equation above shows that all the link counts, weighted by $v_i$, produce one dimension of the transformed data. Thus the vector $u_i$ captures the temporal variation common to the entire ensemble of link traffic time series along principal axis i. Because the principal axes are ordered by their contribution to the overall variance, $u_1$ captures the strongest temporal trend common to all link traffic, $u_2$ captures the next strongest, and so on. The set $\{u_i\}_{i=1}^{4}$ captures most of the variance and hence the most significant temporal patterns common to all the link traffic time series. Fig. 2H and 2I show, respectively, the strongest principal component $u_1$ and a component that accounts for far less of the variance.
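Equations (1) to (3) can be sketched in a few lines of standard-library Python. The sketch below is a minimal illustration, not the patent's implementation: it assumes power iteration with deflation as the numerical method, and the function names, iteration count, and sample data are all hypothetical.

```python
import math
import random


def center_columns(Y):
    """Subtract each column's mean so every column of Y has zero mean."""
    t, m = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / t for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in Y]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(v):
    return math.sqrt(dot(v, v))


def principal_axes(Y, k, iters=500, seed=0):
    """First k principal axes of a centered matrix Y (power iteration)."""
    rng = random.Random(seed)
    m = len(Y[0])
    # G = Y^T Y; maximizing ||Yv|| over unit v gives the top eigenvector of G.
    G = [[sum(row[a] * row[b] for row in Y) for b in range(m)] for a in range(m)]
    axes = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(m)]
        for _ in range(iters):
            w = [dot(g_row, v) for g_row in G]
            # Deflation: drop what earlier axes already explain (equation 2).
            for u in axes:
                c = dot(w, u)
                w = [x - c * y for x, y in zip(w, u)]
            n = norm(w)
            if n == 0:  # no variance left in the remaining directions
                break
            v = [x / n for x in w]
        axes.append(v)
    return axes


def temporal_pattern(Y, v):
    """u = Yv / ||Yv||, the normalized projection of equation (3)."""
    yv = [dot(row, v) for row in Y]
    n = norm(yv)
    return [x / n for x in yv]


Y = center_columns([[1, 1], [2, 2.1], [3, 2.9], [4, 4.2], [5, 4.8]])
v1, v2 = principal_axes(Y, k=2)
u1 = temporal_pattern(Y, v1)
```

In this toy data the two columns rise together, so `v1` weights both columns almost equally and `u1` follows their shared trend.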
The subspace method works by dividing the principal axes into two sets, corresponding to normal and anomalous variation in the traffic. The space spanned by the set of normal axes is the normal subspace $S$, and the space spanned by the set of anomalous axes is the anomalous subspace $\tilde{S}$. Fig. 2J illustrates this.
The Ux projection of the data shows clearly anomalous behavior. The traffic "spike" 230 indicates an unusual network condition, which may be caused by an anomaly. The subspace method treats such a projection of the data as belonging to the anomalous subspace.
A variety of procedures can be used to separate the two types of projections into a normal set and an anomalous set. A simple threshold-based separation method, which examines the difference between typical and atypical projections, works well in practice. The separation procedure examines the projection on each principal axis in order, from the axis expected to carry the most variance to the one expected to carry the least. As soon as a projection is found that exceeds the threshold (for example, one that contains a 3σ deviation from the mean), that principal axis and all subsequent axes are assigned to the anomalous subspace. All earlier principal axes are thereby assigned to the normal subspace. This procedure places the first few principal components in the normal subspace.
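The threshold-based separation described above can be sketched as follows. This is an illustration under stated assumptions: `projections` is taken to be a list of projection time series already ordered from most to least variance, and the 3σ rule is applied to the population standard deviation of each series. The function name and sample data are hypothetical.

```python
import statistics


def count_normal_axes(projections, n_sigma=3.0):
    """Return r, the number of leading axes assigned to the normal subspace.

    The first projection containing a point more than n_sigma standard
    deviations from its mean, and every later one, is treated as anomalous.
    """
    for i, series in enumerate(projections):
        mean = statistics.fmean(series)
        sd = statistics.pstdev(series)
        if sd > 0 and any(abs(x - mean) > n_sigma * sd for x in series):
            return i
    return len(projections)


periodic = [1, -1] * 15            # regular oscillation, no outlier
spiky = [0] * 29 + [10]            # flat with one large spike
small = [0.5, -0.5] * 15
r = count_normal_axes([periodic, spiky, small])
```

Here only the first axis is kept as normal, because the second projection contains a spike well beyond three standard deviations.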
Once the space of all possible link traffic measurements has been divided into the subspaces $S$ and $\tilde{S}$, the traffic on each link is decomposed into its normal and anomalous components.
The methods used for detecting and identifying traffic anomalies are drawn from the theory of subspace-based fault detection in multivariate process control.
Detecting traffic anomalies in link traffic relies on separating the link traffic y at any time step into normal and anomalous components. These components are the modeled part and the residual part of y.
In the subspace-based detection step, once $S$ and $\tilde{S}$ have been constructed, this separation can be carried out by projecting the link traffic onto the two subspaces. The set of link measurements y at any point in time is decomposed as:
$$y = \hat{y} + \tilde{y} \qquad (4)$$
such that $\hat{y}$ corresponds to the modeled part and $\tilde{y}$ corresponds to the residual traffic. $\hat{y}$ is formed by projecting y onto $S$, and $\tilde{y}$ is formed by projecting y onto $\tilde{S}$.
The set of principal components corresponding to the normal subspace $(v_1, v_2, \ldots, v_r)$ is arranged as the columns of a matrix P of size $m \times r$, where r denotes the number of normal axes. $\hat{y}$ and $\tilde{y}$ are then:
$$\hat{y} = PP^T y = Cy \quad \text{and} \quad \tilde{y} = (I - PP^T)\, y = \tilde{C} y \qquad (5)$$
where the matrix $C = PP^T$ represents the linear operator that performs projection onto the normal subspace $S$, and likewise $\tilde{C}$ projects onto the anomalous subspace $\tilde{S}$.
Thus $\hat{y}$ contains the modeled traffic and $\tilde{y}$ contains the residual traffic. In general, the occurrence of a traffic anomaly causes a large change in $\tilde{y}$.
A useful statistic for detecting abnormal changes in $\tilde{y}$ is the squared prediction error (SPE):
$$\mathrm{SPE} = \|\tilde{y}\|^2 = \|\tilde{C} y\|^2 \qquad (6)$$
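The decomposition of equations (4) and (5) and the SPE of equation (6) can be sketched as follows, assuming the normal axes are orthonormal and are supplied as a list of vectors (the columns of P). The function names and the three-link example are hypothetical.

```python
def project_normal(normal_axes, y):
    """y_hat = P P^T y, with the columns of P given as orthonormal vectors."""
    y_hat = [0.0] * len(y)
    for v in normal_axes:
        coeff = sum(a * b for a, b in zip(v, y))
        y_hat = [h + coeff * a for h, a in zip(y_hat, v)]
    return y_hat


def squared_prediction_error(normal_axes, y):
    """SPE = ||y - y_hat||^2, the squared norm of the residual traffic."""
    y_hat = project_normal(normal_axes, y)
    return sum((a - b) ** 2 for a, b in zip(y, y_hat))


# Toy example with three links and a single normal axis along link 0.
P_columns = [[1.0, 0.0, 0.0]]
y = [3.0, 4.0, 0.0]
y_hat = project_normal(P_columns, y)
spe = squared_prediction_error(P_columns, y)
```

In this example the traffic on link 1 is not explained by the normal subspace, so it appears entirely in the residual and drives the SPE.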
Network traffic is considered normal when:
$$\mathrm{SPE} \le \delta_\alpha^2 \qquad (7)$$
where $\delta_\alpha^2$ denotes the threshold for the SPE at the $1-\alpha$ confidence level. A statistical test for the residual vector, known as the Q-statistic, is given by:
$$\delta_\alpha^2 = \phi_1 \left[ \frac{c_\alpha \sqrt{2 \phi_2 h_0^2}}{\phi_1} + 1 + \frac{\phi_2 h_0 (h_0 - 1)}{\phi_1^2} \right]^{\frac{1}{h_0}} \qquad (8)$$
where
$$h_0 = 1 - \frac{2 \phi_1 \phi_3}{3 \phi_2^2}, \quad \text{and} \quad \phi_i = \sum_{j=r+1}^{m} \lambda_j^i, \quad i = 1, 2, 3 \qquad (9)$$
and where $\lambda_j$ is the variance captured by projecting the data onto the j-th principal component ($\|Yv_j\|^2$), and $c_\alpha$ is the $1-\alpha$ percentile of the standard normal distribution. This result holds regardless of how many principal components are retained in the normal subspace.
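The threshold of equations (8) and (9) can be computed directly from the variances of the residual principal components. The sketch below assumes the standard form of the Q-statistic threshold and uses `statistics.NormalDist` for the normal percentile; the function name and the sample eigenvalues are hypothetical.

```python
import math
from statistics import NormalDist


def q_statistic_threshold(residual_variances, alpha=0.001):
    """SPE threshold at confidence level 1 - alpha.

    residual_variances holds lambda_j for j = r+1 .. m, the variances
    captured by the axes assigned to the anomalous subspace.
    """
    phi1 = sum(lam for lam in residual_variances)
    phi2 = sum(lam ** 2 for lam in residual_variances)
    phi3 = sum(lam ** 3 for lam in residual_variances)
    h0 = 1.0 - (2.0 * phi1 * phi3) / (3.0 * phi2 ** 2)
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)  # 1 - alpha normal percentile
    bracket = (
        c_alpha * math.sqrt(2.0 * phi2 * h0 ** 2) / phi1
        + 1.0
        + phi2 * h0 * (h0 - 1.0) / phi1 ** 2
    )
    return phi1 * bracket ** (1.0 / h0)


threshold_95 = q_statistic_threshold([1.0, 1.0, 1.0], alpha=0.05)
threshold_999 = q_statistic_threshold([1.0, 1.0, 1.0], alpha=0.001)
```

As a sanity check, with three unit variances the SPE is a sum of three squared standard normals, and the 95% threshold computed this way lies close to the corresponding chi-square quantile (about 7.8).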
In this setting, the $1-\alpha$ confidence limit corresponds to a false alarm rate of $\alpha$, provided the assumptions under which the result was derived are satisfied. The confidence limit for the Q-statistic is derived under the assumption that the sample vector y follows a multivariate Gaussian distribution. However, it has been noted that the Q-statistic changes little even when the underlying distribution of the original data differs substantially from Gaussian.
In the subspace framework, a traffic anomaly represents a displacement of the state vector y away from $S$. The particular direction of the displacement gives information about the nature of the anomaly. The approach to anomaly identification is to ask which anomaly, out of a set of possible anomalies, best describes the deviation of y from the normal subspace $S$.
The set of all possible anomalies is $\{F_i,\ i = 1, \ldots, I\}$. This set should be chosen to be as complete as possible, because it defines the set of anomalies that can be identified.
For ease of exposition, only one-dimensional anomalies are considered, that is, anomalies in which the additional traffic on each link can be described as a linear function of a single variable. Generalizing the approach to multi-dimensional anomalies is straightforward.
Each anomaly $F_i$ then has an associated vector $\theta_i$, which defines the manner in which the anomaly adds traffic to each link in the network. Assuming $\theta_i$ has unit norm, the state vector y in the presence of anomaly $F_i$ is represented by:
$$y = y^* + \theta_i f_i \qquad (10)$$
where $y^*$ denotes the sample vector for normal traffic conditions (which is unknown when the anomaly occurs), and $f_i$ denotes the magnitude of the anomaly.
Given a hypothesized anomaly $F_i$, an estimate of $y^*$ is formed by eliminating the effect of the anomaly, which corresponds to subtracting some traffic contribution from the links associated with anomaly $F_i$. The best estimate of $y^*$ assuming anomaly $F_i$ is found by minimizing the distance to the normal subspace $S$ in the direction of the anomaly:
$$\hat{f}_i = \arg\min_{f_i} \|\tilde{y} - \tilde{\theta}_i f_i\| \qquad (11)$$
where $\tilde{y} = \tilde{C} y$ and $\tilde{\theta}_i = \tilde{C} \theta_i$. This gives $\hat{f}_i = (\tilde{\theta}_i^T \tilde{\theta}_i)^{-1} \tilde{\theta}_i^T \tilde{y}$.
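Thus, the best estimate of $y^*$ assuming anomaly $F_i$ is:
$$y_i^* = y - \theta_i \hat{f}_i = y - \theta_i (\tilde{\theta}_i^T \tilde{\theta}_i)^{-1} \tilde{\theta}_i^T \tilde{y} = \left( I - \theta_i (\tilde{\theta}_i^T \tilde{\theta}_i)^{-1} \tilde{\theta}_i^T \tilde{C} \right) y \qquad (12)$$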
To identify the best hypothesis from the set of possible anomalies, the hypothesis that explains the largest amount of residual traffic is chosen. That is, the $F_i$ is chosen that minimizes the projection of $y_i^*$ onto $\tilde{S}$.
In summary, the identification algorithm consists of:
1. For each hypothesized anomaly $F_i$, $i = 1, \ldots, I$, compute $y_i^*$ using equation (12).
2. Choose the anomaly $F_j$ with $j = \arg\min_i \|\tilde{C} y_i^*\|$.
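The two identification steps can be sketched as follows. This is a minimal illustration that assumes orthonormal normal axes and unit-norm anomaly vectors, and that skips any hypothesis lying entirely in the normal subspace. The function names and the three-link example are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def residual(normal_axes, x):
    """Apply C~ = I - P P^T to x, with P's columns given as orthonormal vectors."""
    out = list(x)
    for v in normal_axes:
        c = dot(v, x)
        out = [o - c * a for o, a in zip(out, v)]
    return out


def identify_anomaly(normal_axes, y, thetas):
    """Return (j, f_hat): the best hypothesis index and its estimated magnitude."""
    y_res = residual(normal_axes, y)
    best = None
    for i, theta in enumerate(thetas):
        theta_res = residual(normal_axes, theta)
        denom = dot(theta_res, theta_res)
        if denom == 0:
            continue  # hypothesis is invisible in the anomalous subspace
        f_hat = dot(theta_res, y_res) / denom            # solution of eq. (11)
        y_star = [a - f_hat * b for a, b in zip(y, theta)]  # eq. (12)
        score = math.sqrt(sum(x * x for x in residual(normal_axes, y_star)))
        if best is None or score < best[0]:
            best = (score, i, f_hat)
    return None if best is None else (best[1], best[2])


normal_axes = [[1.0, 0.0, 0.0]]
thetas = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # one hypothesis per link
j, f_hat = identify_anomaly(normal_axes, [5.0, 0.1, 4.0], thetas)
```

In the example, removing the second hypothesis leaves the smallest residual, so it is selected with an estimated magnitude of 4.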
The possible anomalies are $\{F_i,\ i = 1, \ldots, n\}$, where n is the number of OD flows in the network. In this case, each anomaly adds (or subtracts) the same amount of traffic on every link it affects. $\theta_i$ is therefore defined as column i of the routing matrix A, normalized to unit norm: $\theta_i = A_i / \|A_i\|$.
Given an estimate of the particular anomaly $F_i$, the number of bytes that make up the anomaly can be estimated. The amount of anomalous traffic on each link due to the chosen anomaly $F_i$ is estimated as:
$$y' = y - y_i^* \qquad (13)$$
which, by equation (12), equals $\theta_i \hat{f}_i$. The estimated sum of the additional traffic is then proportional to $\theta_i^T y'$. Because the additional traffic flows over multiple links, it must be normalized by the number of links affected by the anomaly.
In this example, because anomalies are defined by the set of OD flows, the quantification depends on A. We use $\bar{A}$ to denote the routing matrix normalized so that each column of $\bar{A}$ has unit sum, that is:
$$\bar{A}_i = \frac{A_i}{\sum A_i} \qquad (14)$$
Given an identified anomaly $F_i$, the quantification estimate is then:
$$\bar{A}_i^T y' \qquad (15)$$
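The normalization of equation (14) and the estimate of equation (15) can be sketched as follows. The sketch takes the per-link anomalous traffic vector `y_prime` as an input rather than deriving it, and assumes a routing column with 0/1 entries; the function name and the numbers are hypothetical.

```python
def quantify_anomaly(routing_column, y_prime):
    """Estimate anomaly size as the unit-sum routing column times y'.

    routing_column is column i of the routing matrix A (1 where the OD
    flow crosses a link, 0 elsewhere). Dividing by its sum averages the
    anomalous traffic over the links the flow traverses.
    """
    total = sum(routing_column)
    return sum((a / total) * t for a, t in zip(routing_column, y_prime))


# An OD flow crossing links 0 and 1, with 100 extra bytes seen on each.
estimate = quantify_anomaly([1, 1, 0], [100.0, 100.0, 0.0])
```

Without the normalization the same 100 bytes would be counted once per link traversed, which is why the unit-sum column is used.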
Some anomalies may lie entirely within the normal subspace $S$ and so cannot be detected by the subspace method. Formally, this can occur if $\tilde{C} \theta_i = 0$ for some anomaly $F_i$. In practice this is very unlikely, because it requires the anomaly and the normal subspace $S$ to be perfectly aligned. However, the relationship between the anomaly vector $\theta_i$ and the normal subspace can make an anomaly of a given size in one direction harder to detect than in another direction.
The principles described above are applied in another aspect of the invention to produce a multi-feature (multi-way), multi-source (multivariate) distribution of traffic data. The process begins at step 310 of Fig. 3A, which collects data on multiple features from multiple sources across the whole network. In step 340 the data is organized into a three-dimensional matrix form, an example of which is shown in Fig. 3C. Here a series of matrices 332 is formed, one matrix per feature. Each matrix has one column for each source, OD pair, or link present, and one row for each time interval. In the example of the invention, the features are source IP, source port, destination IP, and destination port. More, fewer, or other features may also be used.
Fig. 3D illustrates the data set for each matrix element, meaning that matrix 332 is in effect a three-dimensional matrix in which each matrix position holds a series of data points 334. This is analogous to the matrix of sources against time intervals shown above.
In step 344 of Fig. 3A, this data is statistically reduced by a process that characterizes the distribution of each feature; in this example, an entropy metric provides the result as state 346. This yields the set of three-dimensional matrices 336 shown in Fig. 3G, discussed below. The entropy metric processes each data set using the following formula:
$$H(X) = -\sum_{i=1}^{N} \left( \frac{n_i}{S} \right) \log_2 \left( \frac{n_i}{S} \right) \qquad (16)$$
Here, value i occurs $n_i$ times, and S is the total number of observations in the matrix. The new matrix 336 of Fig. 3G, in new state 360, is likewise three-dimensional.
Fig. 3E and 3F illustrate the statistical reduction of two different distributions using the entropy metric, showing a high-entropy and a low-entropy histogram. When the histogram of a distribution is dispersed (Fig. 3E), its entropy is high. When the histogram is skewed or concentrated on a few values, as in Fig. 3F, its entropy is low.
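The contrast between a dispersed and a concentrated histogram can be checked numerically. The sketch below computes the sample entropy of equation (16) using the usual Shannon sign convention, so the result is non-negative; the function name and the sample port values are hypothetical.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Entropy (in bits) of the empirical distribution of the observed values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Destination ports seen in one time interval on one source (toy data).
dispersed = [80, 443, 22, 25]          # every observation differs
concentrated = [80, 80, 80, 80]        # all traffic aimed at one port
high = sample_entropy(dispersed)
low = sample_entropy(concentrated)
```

Four equally likely values give 2 bits, the maximum for four observations, while a single repeated value gives 0 bits.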
In the following step 338, matrix 336 is "unfolded" into one large two-dimensional matrix, in which the corresponding rows of each feature's matrix 336 are concatenated, feature by feature, into a single long row such as row 348 shown in Fig. 3G. In Fig. 3G the features are, by way of example, source IP address and port and destination IP address and port.
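The unfolding step can be sketched as a row-wise concatenation. The sketch assumes each feature matrix has the same number of rows (time intervals) and columns (sources); the function name and the two small matrices are hypothetical.

```python
def unfold(feature_matrices):
    """Concatenate K matrices of shape t x p into one t x (K*p) matrix.

    Row r of the result is row r of the first feature matrix followed by
    row r of the second, and so on, so each row covers one time interval.
    """
    n_rows = len(feature_matrices[0])
    return [
        [value for matrix in feature_matrices for value in matrix[r]]
        for r in range(n_rows)
    ]


src_ip_entropy = [[1, 2], [3, 4]]      # 2 intervals x 2 sources
dst_port_entropy = [[5, 6], [7, 8]]
unfolded = unfold([src_ip_entropy, dst_port_entropy])
```

With four features and p sources, each unfolded row would hold 4p entropy values for a single time interval.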
Matrix 342 is then processed in steps 350 and 352 using the subspace method described earlier. This is an iterative process that repeats step 352, looping through steps 370 and 380. The end result of the iteration is described below.
In the anomaly classification process of step 354, the residual component of each of the K features is found for each detected anomaly. Each detected anomaly thus yields a set of K numbers, one for each feature of matrix 340. The K numbers represent a point in K-dimensional space and are treated as such in step 356. That is, the K numbers are taken as positions along K axes and are plotted in the K-dimensional space in step 358. The plotting is performed in a processor such as processor 120 and an associated database.
Next, in step 360, a clustering technique identifies clusters of points that lie close to one another according to a proximity threshold. This threshold is determined directly from the database and is adjusted over time, as part of learning from use, to obtain accurate results. The clustering can be carried out in a lower-dimensional space, for example by projecting the points onto the two-dimensional space of Fig. 3H.
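The text does not name a specific clustering algorithm. As one possible illustration, the sketch below groups points by single-linkage clustering with a fixed proximity threshold, using a small union-find structure; the function name, the threshold value, and the sample points are hypothetical.

```python
import math


def cluster_by_proximity(points, threshold):
    """Label points so that any two within `threshold` share a cluster.

    Clusters are the connected components of the "closer than threshold"
    graph (single linkage). Labels are numbered in order of appearance.
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    label_of_root = {}
    return [label_of_root.setdefault(find(i), len(label_of_root))
            for i in range(len(points))]


# Residual entropies of two features (K = 2) for four detected anomalies.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
labels = cluster_by_proximity(points, threshold=0.5)
```

Other methods, such as k-means or hierarchical agglomerative clustering, could be substituted without changing the surrounding steps.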
The resulting clusters shown in Fig. 3H (362 shows an example with dimension K=2) can be interpreted by rules. These rules are initially determined from manual observation, so as to associate a region with a human-friendly description of the anomaly. For example: "an anomaly that falls in the region of high residual destination-IP entropy and low residual destination-port entropy is a worm scan." Fig. 3I shows a table of actual data obtained by using the invention to evaluate real network traffic, together with its interpretation. The table shows how, for the four features described above, the entropy level (- for low, + for high) defines the cluster locations that are interpreted as a number of anomaly types (the labels shown in the figure). By adding location as a feature in the classification, the invention can perform classification and localization simultaneously. Other anomaly types that can be distinguished include content distribution, routing loops, traffic engineering, and overload. The invention can use clustering-based discovery to find anomaly types other than those above, or previously unknown anomalies. This provides an unsupervised learning method for identifying new types of network anomaly.
In this way, the various service providers (for example, network service providers or cable providers) that subscribe to or use the network features of the invention can take remedial steps to address anomalies, and can guarantee this capability to their users. This may make their services more attractive. Service providers may also contract this function out to independent analysts by granting them access to the necessary network elements, thereby creating a new business opportunity.
The invention has been illustrated by its operation in a service provider network, but it can equally be used in other types of network, such as highway traffic networks, postal service networks, and sensor networks.

Claims (75)

1. A method of monitoring communication network traffic, comprising:
forming a plurality of time series that collectively represent a feature of the communication network traffic on a plurality of network elements; and
decomposing said plurality of time series into at least one communication network traffic pattern present in at least two of said plurality of network elements.
2. The method of monitoring communication network traffic according to claim 1, wherein said plurality of time series is from a data stream.
3. The method of monitoring communication network traffic according to claim 1, wherein said plurality of time series is formed from multiple sources.
4. A method of monitoring communication network traffic, comprising:
forming a plurality of time series that collectively represent a temporal feature of the communication network traffic on a plurality of network elements; and
decomposing said plurality of time series, by time, into at least one communication network traffic pattern present in at least one of said plurality of network elements.
5. The method according to claim 1, wherein said plurality of time series is formed from data of communication network traffic corresponding to a plurality of network locations.
6. The method according to claim 5, wherein said decomposing comprises performing principal component analysis on said time series.
7. The method according to claim 1, further comprising constructing a data model that represents normal communication network traffic patterns present in said plurality of network elements.
8. The method according to claim 7, wherein said constructing comprises forming a low-dimensional approximation of normal communication network traffic.
9. The method according to claim 8, wherein said constructing further comprises extracting from said time series at least one pattern representing a portion of the communication network traffic information.
10. The method according to claim 8, wherein said constructing further comprises extracting from said time series at least one pattern that represents a portion thereof as a coherence measure.
11. The method according to any one of claims 1-10, further comprising detecting an anomaly.
12. The method according to claim 11, wherein said detecting of the anomaly comprises evaluating a data stream of said communication network traffic.
13. The method according to claim 11, wherein said detecting comprises counting at least one packet.
14. The method according to claim 13, wherein said detecting comprises processing packet content.
15. The method according to claim 11, wherein said detecting comprises detecting a type of said anomaly.
16. The method according to claim 11, wherein said detecting comprises identifying residual communication network traffic apart from normal network traffic.
17. The method according to claim 16, wherein the analysis comprises determining when said residual communication network traffic exceeds a statistical threshold.
18. The method according to claim 17, wherein said determining comprises localizing said anomaly to one or more network elements affected by said anomaly.
19. The method according to claim 18, wherein said localizing comprises comparing said anomaly with at least one learned hypothesis to find a match.
20. The method according to any one of claims 1-19, further comprising finding a deviation in a feature of said communication network traffic.
21. The method according to claim 20, wherein said finding of the deviation comprises using entropy to summarize a traffic feature distribution as a single number.
22. The method according to claim 20, further comprising localizing said feature deviation.
23. The method according to any one of claims 1-22, wherein said decomposing comprises clustering.
24. A method of analyzing communication network traffic, comprising:
forming a model from one or more data types from one or more sources, said model having at least one dimension corresponding to communication network traffic on a plurality of network elements; and
detecting an anomaly as a pattern in the communication network traffic model.
25. The method according to claim 24, wherein said detected anomaly pattern is a pattern representing a feature of one or more anomalies in the group consisting of: DoS attack, worm scan, port scan, flash crowd, content distribution, large traffic shift, overload, outage event, routing loop, and traffic engineering.
26. The method according to claim 24, wherein said pattern is a previously unknown pattern.
27. The method according to claim 24, wherein said detecting comprises evaluating a data stream of said communication network traffic.
28. The method according to claim 24, wherein said detecting comprises counting at least one packet.
29. The method according to claim 28, wherein said detecting comprises processing packet content.
30. The method according to claim 24, wherein said detecting comprises detecting a type of said anomaly.
31. The method according to claim 24, wherein said detecting comprises identifying residual communication network traffic apart from normal network traffic.
32. The method according to claim 31, wherein the analysis comprises determining when said residual or said normal communication network traffic exceeds a statistical threshold.
33. The method according to claim 32, wherein said determining step comprises localizing said anomaly to one or more network elements affected by said anomaly.
34. The method according to claim 33, wherein said localizing comprises comparing said anomaly with at least one predetermined hypothesis to find a match.
35. The method according to any one of claims 24-34, further comprising finding a deviation in a feature of said communication network traffic.
36. The method according to claim 35, wherein said finding of the deviation comprises using entropy to summarize a distribution as a single number.
37. The method according to claim 35, further comprising localizing said feature deviation to one or more network elements affected by said anomaly.
38. The method according to any one of claims 24-37, wherein said decomposing comprises clustering.
39. A method for detecting an anomaly in a communication network, comprising:
finding a deviation in a pattern determined from a plurality of features of communication network traffic.
40. The method according to claim 39, wherein said pattern deviation is a pattern representing a feature of one or more anomalies in the following group: DoS attack, worm scan, port scan, flash crowd, content distribution, large traffic shift, overload, outage event, routing loop, and traffic engineering.
41. The method according to claim 40, wherein said pattern represents a previously unknown anomaly.
42. The method according to claim 39, wherein said finding of a deviation comprises finding a deviation in a feature set comprising source address, destination address, application port, and protocol information.
43. The method according to claim 39, wherein finding a deviation comprises evaluating a data stream of said communication network traffic.
44. The method according to claim 39, wherein finding a deviation comprises counting at least one packet.
45. The method according to claim 44, wherein finding a deviation comprises processing packet content.
46. The method according to claim 39, wherein finding a deviation comprises detecting a type of said anomaly.
47. The method according to claim 39, further comprising localizing said deviation to the affected elements.
48. The method according to any one of claims 39-47, wherein said finding of a deviation comprises clustering.
49. A method of monitoring communication network traffic, comprising:
generating at least one distribution of a communication network traffic feature;
estimating an entropy of said distribution;
designating said communication network traffic feature as anomalous when the entropy of said communication network traffic feature differs from a threshold of said entropy.
50. The method according to claim 49, further comprising determining the threshold of said entropy of said distribution from network traffic history.
51. The method according to claim 49, wherein said network traffic feature is selected from the group comprising one or more of: source IP, destination IP, source port, destination port, and communication protocol.
52. The method according to claim 49, wherein estimating the entropy further comprises estimating the entropy of a communication network traffic feature distribution of a single element of the communication network.
53. The method according to claim 49, wherein estimating the entropy further comprises identifying whether said entropy is high or low in the feature distribution.
54. The method according to claim 49, wherein estimating the entropy of said communication network traffic feature further comprises a matrix operation.
55. The method according to claim 49, wherein estimating the entropy further comprises estimating the entropy of said distribution over a time interval.
56. An anomaly detection method, comprising:
forming a network traffic model corresponding to multiple types and multiple sources;
extracting a pattern from said model;
setting a threshold in said pattern to identify an anomaly.
57. A method of characterizing network traffic, comprising the steps of:
clustering data about a plurality of features of network traffic; and
then forming a model to identify normal traffic conditions of said communication network.
58. A method of characterizing network traffic, comprising the steps of:
clustering data about a plurality of features of network traffic; and
then forming a model to identify abnormal traffic conditions of said network traffic.
59. In a network over which communications are provided by one or more service providers, a method of assuring service to customers of a provider, comprising the steps of:
performing the method according to any one of claims 8-58; and
localizing said detected anomaly.
60. The method according to claim 59, further comprising the step of obtaining access to data on said network.
61. The method according to claim 39, wherein a threshold is determined from the pattern determined from a plurality of features of communication network traffic.
62. An apparatus encoded with processing instructions to perform the method according to any one of claims 1-61.
63. The apparatus according to claim 62, wherein said encoding is carried on one or more portions of a medium.
64. The apparatus according to claim 62, wherein said encoding is executed in one or more processors, and wherein the one or more processors so encoded are operative to perform said method.
65. An apparatus for monitoring communication network traffic, comprising:
means for forming a plurality of time series, said plurality of time series collectively representing a feature of communication network traffic on a plurality of network elements; and
means for decomposing said plurality of time series into at least one communication network traffic pattern present in at least two of said plurality of network elements.
66. The apparatus for monitoring communication network traffic according to claim 65, wherein said plurality of time series is from a data stream.
67. The apparatus for monitoring communication network traffic according to claim 65, wherein said plurality of time series is formed from multiple sources.
68. An apparatus for monitoring communication network traffic, comprising:
means for forming a plurality of time series, said plurality of time series collectively representing a temporal feature of communication network traffic on a plurality of network elements; and
means for decomposing said plurality of time series, by time, into at least one communication network traffic pattern present in at least one of said plurality of network elements.
69. An apparatus for analyzing communication network traffic, comprising:
means for forming a model from one or more data types from one or more sources, said model having at least one dimension corresponding to communication network traffic on a plurality of network elements; and
means for detecting an anomaly as a pattern in said communication network traffic model.
70. An apparatus for detecting an anomaly in a communication network, comprising:
means for finding a deviation in a pattern determined from a plurality of features of communication network traffic.
71. An apparatus for monitoring communication network traffic, comprising:
means for generating at least one distribution of a communication network traffic feature;
means for estimating an entropy of said distribution;
means for designating said communication network traffic feature as anomalous when the entropy of said communication network traffic feature differs from a threshold of said entropy.
72. An apparatus for anomaly detection, comprising:
means for forming a model of network traffic corresponding to multiple types and multiple sources;
means for extracting a pattern from said model;
means for setting a threshold in said pattern to identify an anomaly.
73. An apparatus for characterizing network traffic, comprising:
means for clustering data about a plurality of features of network traffic; and
means for then forming a model to identify normal traffic conditions of said communication network.
74. A method for monitoring communication network traffic, comprising:
generating at least one distribution of a communication network traffic feature, said communication network traffic feature being selected from the group comprising source IP, source port, destination IP, and destination port;
estimating an entropy of said feature distribution;
identifying an anomaly type, according to said entropy, as one or more of the following types:
Type                  H(srcIP)  H(srcPort)  H(dstIP)  H(dstPort)
Alpha                 -         0           -         -
Network scan          0         +           0         0
Port scan             -         +           -         +
Port scan             0         -           0         +
Alpha                 0         0           +         0
Outage                0         0           0         +
Alpha                 -         0           -         0
Point to multipoint   0         0           0         -
Flash crowd           0         0           0         -
Alpha                 0         -           0         0
75. An apparatus for monitoring communication network traffic, comprising:
means for generating at least one distribution of a communication network traffic feature, said communication network traffic feature being selected from the group comprising source IP, source port, destination IP, and destination port;
means for estimating an entropy of said feature distribution;
means for identifying an anomaly type, according to said entropy, as one or more of the following types:
Type                  H(srcIP)  H(srcPort)  H(dstIP)  H(dstPort)
Alpha                 -         0           -         -
Network scan          0         +           0         0
Port scan             -         +           -         +
Port scan             0         -           0         +
Alpha                 0         0           +         0
Outage                0         0           0         +
Alpha                 -         0           -         0
Point to multipoint   0         0           0         -
Flash crowd           0         0           0         -
Alpha                 0         -           0         0
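
The H(·) values tabulated above can be illustrated with a short sketch, assuming that H denotes the sample (Shannon) entropy of a feature's empirical distribution over a time interval. The threshold rule below (deviation of more than three standard deviations from a history of entropy values) is an assumption chosen for illustration; the claims state only that the threshold is determined from traffic history. The names `sample_entropy` and `is_anomalous` and all data values are hypothetical.

```python
import math
from collections import Counter


def sample_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def is_anomalous(entropy, history, tolerance=3.0):
    """Flag an entropy value that deviates from its history.

    Assumption: "differs from a threshold" is modeled as lying more than
    `tolerance` standard deviations away from the historical mean.
    """
    mean = sum(history) / len(history)
    std = math.sqrt(sum((h - mean) ** 2 for h in history) / len(history))
    return abs(entropy - mean) > tolerance * std


# Hypothetical destination ports seen in one time interval.
concentrated = ["80", "80", "80", "80"]    # all traffic on one port
dispersed = ["80", "443", "22", "8080"]    # traffic spread evenly

# Hypothetical entropy values from earlier intervals.
history = [1.9, 2.0, 2.1, 2.0]
```

A distribution concentrated on one value has entropy 0 bits and an even spread over four values has entropy 2 bits, which is how a single number can summarize whether a feature distribution is concentrated (low, -) or dispersed (high, +). With the made-up history above, `is_anomalous(sample_entropy(concentrated), history)` is true.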
CNA2006800315024A 2005-06-29 2006-06-29 Whole-network anomaly diagnosis Pending CN101258479A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US69485305P 2005-06-29 2005-06-29
US60/694,853 2005-06-29
US60/694,840 2005-06-29

Publications (1)

Publication Number Publication Date
CN101258479A true CN101258479A (en) 2008-09-03

Family

ID=39892276

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2006800315024A Pending CN101258479A (en) 2005-06-29 2006-06-29 Whole-network anomaly diagnosis

Country Status (1)

Country Link
CN (1) CN101258479A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105027135A (en) * 2013-03-29 2015-11-04 Intel Corporation Distributed traffic pattern analysis and entropy prediction for detecting malware in a network environment
US10027695B2 (en) 2013-03-29 2018-07-17 Intel Corporation Distributed traffic pattern analysis and entropy prediction for detecting malware in a network environment
CN105027135B (en) * 2013-03-29 2018-09-18 Intel Corporation Distributed traffic pattern analysis and entropy prediction for detecting malware in a network environment
CN117978612A (en) * 2024-03-28 2024-05-03 成都格理特电子技术有限公司 Network fault detection method, storage medium and electronic equipment
CN117978612B (en) * 2024-03-28 2024-06-04 成都格理特电子技术有限公司 Network fault detection method, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
US8869276B2 (en) Method and apparatus for whole-network anomaly diagnosis and method to detect and classify network anomalies using traffic feature distributions
CN112651006B (en) Power grid security situation sensing system
Dromard et al. Online and scalable unsupervised network anomaly detection method
Saxena et al. Intrusion detection in KDD99 dataset using SVM-PSO and feature reduction with information gain
KR101538709B1 (en) Anomaly detection system and method for industrial control network
CN106934627B (en) Method and device for detecting cheating behaviors of e-commerce industry
CN101848160B (en) Method for detecting and classifying all-network flow abnormity on line
CN106104556B (en) Log Analysis System
CN102611713B (en) Entropy operation-based network intrusion detection method and device
CN104246786A (en) Field selection for pattern discovery
CN104660464B (en) A kind of network anomaly detection method based on non-extension entropy
US11240119B2 (en) Network operation
CN106453392A (en) Whole-network abnormal flow identification method based on flow characteristic distribution
CN108255996A (en) Safe log analyzing method based on Apriori algorithm
CN104092588B (en) A kind of exception flow of network detection method combined based on SNMP with NetFlow
Al-Sanjary et al. Comparison and detection analysis of network traffic datasets using K-means clustering algorithm
Liao et al. Visual analysis of large-scale network anomalies
CN113205134A (en) Network security situation prediction method and system
CN114374626A (en) Router performance detection method under 5G network condition
CN109150845A (en) Monitor the method and system of terminal flow
Paul et al. Characterizing performance inequity across us ookla speedtest users
CN111490991B (en) Multiple server connection request system and method based on communication equipment
CN113765850B (en) Internet of things abnormality detection method and device, computing equipment and computer storage medium
CN110881022A (en) Large-scale network security situation detection and analysis method
CN110912933B (en) Equipment identification method based on passive measurement

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20080903