CN103747477B - Network traffic analysis and prediction method and device - Google Patents

Info

Publication number
CN103747477B
Authority
CN
China
Prior art keywords
flow
time sequence
calculation formula
feature
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410019136.7A
Other languages
Chinese (zh)
Other versions
CN103747477A (en
Inventor
杜翠凤
陆蕊
蒋仕宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GCI Science and Technology Co Ltd
Original Assignee
GCI Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GCI Science and Technology Co Ltd
Priority to CN201410019136.7A
Publication of CN103747477A
Application granted
Publication of CN103747477B

Abstract

The invention discloses a network traffic analysis and prediction method and device. The global features of the traffic time series of each base station under test are first extracted; clustering is then performed on the extracted global features; attribute features of the traffic data are collected according to the clustering result; finally, traffic is predicted from the attribute features of the traffic data and the traffic at the previous moment. By extracting global features of the time series and using the similarity of those features to reflect the similarity of the series themselves, the invention captures how each series changes over time and obtains more reasonable results. Describing a large-scale time series with a small number of features also improves the robustness of the results and reduces the complexity of the clustering computation. The attribute features related to the traffic data are collected according to the clustering result, and traffic is predicted jointly from the traffic and the attribute features. Because the prediction draws on more information, prediction accuracy improves accordingly, which allows network resources to be allocated rationally.

Description

Network traffic analysis and prediction method and device
Technical field
The present invention relates to the field of communication technology, and in particular to a network traffic analysis and prediction method and device.
Background
In communication network optimization, network traffic analysis and prediction is an important step and is significant for the optimal allocation of network resources. Whether the traffic prediction is accurate, whether the prediction result is interpretable, and whether it agrees with the actual traffic data all directly affect the investment in and the construction scale of the network. Preliminary analysis of the traffic is the key to traffic prediction and directly affects its accuracy.
In the prior art, traffic is analyzed using the original time series: the similarity between time series is measured by Euclidean distance, and clustering is performed according to that similarity. For prediction, historical traffic data is used to predict unknown traffic data by conventional regression prediction, time series analysis, and similar methods.
Existing methods consider only the differences between time series values at corresponding time points. Because similarity is measured by Euclidean distance, the result is easily affected by the values at individual time points and loses robustness. Because only traffic data is used, prediction performance is poor.
Summary of the invention
In view of the above, the present invention proposes a network traffic analysis and prediction method that can improve prediction accuracy and allow network resources to be allocated rationally.
To achieve this object, the technical solution of the present invention is as follows:
A network traffic analysis and prediction method, comprising the following steps:
extracting the global features of the traffic time series of each base station under test;
clustering according to the extracted global features;
collecting attribute features of the traffic data according to the clustering result;
predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment.
To address the problems of the prior art, the invention also provides a network traffic analysis and prediction device that remedies the poor robustness of existing traffic analysis and the low accuracy of existing traffic prediction, and is suitable for practical application.
The specific implementation is a network traffic analysis and prediction device, comprising:
an extraction module, for extracting the global features of the traffic time series of each base station under test;
a clustering module, for clustering according to the extracted global features;
a collection module, for collecting attribute features of the traffic data according to the clustering result;
a prediction module, for predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment.
Compared with the prior art, the beneficial effects of the present invention are as follows. The network traffic analysis and prediction method and device of the invention first extract the global features of the traffic time series of each base station under test, then cluster according to the extracted global features, then collect attribute features of the traffic data according to the clustering result, and finally predict traffic from the attribute features of the traffic data and the traffic at the previous moment. With this technique, the global features of the time series are extracted and their similarity is used to reflect the similarity of the time series, which captures how each series changes over time and gives more reasonable results. Describing a large-scale time series with a small number of features also improves the robustness of the results and reduces the complexity of the clustering computation. Attribute features related to the traffic data are collected according to the clustering result, and traffic is predicted jointly from the traffic and the attribute features. Because the prediction draws on more information, prediction accuracy improves accordingly, which allows network resources to be allocated rationally.
Brief description of the drawings
Fig. 1 is a flow diagram of the network traffic analysis and prediction method in one embodiment;
Fig. 2 is a structural diagram of the network traffic analysis and prediction device in one embodiment.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here serve only to explain the invention and do not limit its scope of protection.
In one embodiment of the network traffic analysis and prediction method, as shown in Fig. 1, the method includes:
Step S101: extract the global features of the traffic time series of each base station under test;
Step S102: cluster according to the extracted global features;
Step S103: collect attribute features of the traffic data according to the clustering result;
Step S104: predict traffic according to the attribute features of the traffic data and the traffic at the previous moment.
As the above description shows, the method predicts traffic jointly from the traffic and the attribute features, which improves the accuracy of network traffic prediction and allows network resources to be allocated rationally.
As one embodiment, the global features include any one or more of a trend feature, a seasonal feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinear feature, and a spectrum feature.
As one embodiment, the traffic time series is obtained by collecting the traffic data of each base station under test daily, continuously for half a year.
As one embodiment, the trend feature is measured by the Z statistic: a Z statistic greater than zero indicates a rising trend, and a Z statistic less than zero indicates a falling trend. The Z statistic is calculated as:

$$Z=\begin{cases}\dfrac{S-1}{\sqrt{\mathrm{Var}(S)}}, & S>0\\[4pt] 0, & S=0\\[4pt] \dfrac{S+1}{\sqrt{\mathrm{Var}(S)}}, & S<0\end{cases}$$

where $S$ is a normally distributed statistic and $\mathrm{Var}(S)$ is the variance of $S$. $S$ is calculated as $S=\sum_{k=1}^{T-1}\sum_{j=k+1}^{T}\operatorname{sgn}(x_j-x_k)$, and $\mathrm{Var}(S)$ as $\mathrm{Var}(S)=T(T-1)(2T+5)/18$. Here $x_t$, $t=1,2,\dots,T$, is the traffic time series, $T$ is its length, $x_j$ and $x_k$ are its values at times $j$ and $k$, and the sign function is:

$$\operatorname{sgn}(x_j-x_k)=\begin{cases}1, & x_j-x_k>0\\ 0, & x_j-x_k=0\\ -1, & x_j-x_k<0\end{cases}$$

The seasonal feature is reflected by the average period, which is computed as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series; the average frequency is then computed from the resulting spectrum and the frequencies used; and the average period is calculated from the average frequency;
The kurtosis in the kurtosis feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
The skewness in the skewness feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, which tests whether the traffic time series is a white-noise process. The Ljung-Box Q statistic is calculated as $Q=T(T+2)\sum_{\tau=1}^{p}\dfrac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the series, calculated as $r_\tau=\dfrac{\sum_{t=\tau+1}^{T}(x_t-\bar{x})(x_{t-\tau}-\bar{x})}{\sum_{t=1}^{T}(x_t-\bar{x})^2}$, where $x_t$, $t=1,2,\dots,T$, is the traffic time series and $\bar{x}$ is its mean;
The nonlinear feature is reflected by the BDS test statistic, which tests whether the traffic time series is independent and identically distributed. For the traffic time series $x_t$, $t=1,2,\dots,T$, with observations $x_s$ and $x_w$ at times $s$ and $w$, all observation pairs $(x_s,x_w)$ are arranged as $\{(x_s,x_w),(x_{s+1},x_{w+1}),(x_{s+2},x_{w+2}),\dots,(x_{s+m-1},x_{w+m-1})\}$, where $m$ is the embedding dimension. The BDS statistic is calculated as $W(N,m,r)=\sqrt{N}\,\dfrac{C(N,m,r)-C(N,1,r)^m}{\sigma'(N,m,r)}$, where $r$ is the interval size, $C(N,m,r)$ is the correlation integral, and $\sigma'(N,m,r)$ is the estimate of the asymptotic standard deviation of $C(N,m,r)-C(N,1,r)^m$. $C(N,m,r)$ is calculated as $C(N,m,r)=\dfrac{2}{(N-m+1)(N-m)}\sum_{s=1}^{N-m+1}\sum_{w=s+1}^{N-m+1}I_r(X_s^m,X_w^m)$, where $X_s^m=(x_s,x_{s+1},\dots,x_{s+m-1})$ is an $m$-dimensional vector and $I_r(X_s^m,X_w^m)$ equals 1 if every component of $X_s^m$ and $X_w^m$ differs by less than $r$ and 0 otherwise;
The spectrum feature consists of the first two coefficients of the extracted discrete Fourier transform (DFT). Spectrum features are extracted from the DFT coefficients, and the first $n$ coefficients may be taken as the spectrum feature: because the high-frequency part of a signal is of little importance, most of the energy in the frequency domain is concentrated in the first few coefficients.
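The trend statistic above can be illustrated with a short Python sketch. It assumes the standard Mann-Kendall form of the Z statistic with a continuity correction, which matches the stated variance T(T-1)(2T+5)/18; the function names `sign` and `trend_z` are hypothetical and do not come from the patent.

```python
import math


def sign(v):
    """Sign function: 1 for positive, 0 for zero, -1 for negative."""
    return (v > 0) - (v < 0)


def trend_z(x):
    """Z statistic of a series x (assumed Mann-Kendall form).

    S sums sgn(x_j - x_k) over all pairs k < j, and
    Var(S) = T(T-1)(2T+5)/18 as stated in the text.
    """
    T = len(x)
    s = sum(sign(x[j] - x[k]) for k in range(T - 1) for j in range(k + 1, T))
    if s == 0:
        return 0.0
    var_s = T * (T - 1) * (2 * T + 5) / 18
    # Continuity correction (assumption): shift S one step towards zero.
    return (s - sign(s)) / math.sqrt(var_s)
```

A positive return value would be read as a rising trend and a negative one as a falling trend. The pairwise loop is quadratic in the series length, which is acceptable for a half-year daily series of roughly 180 points.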
As one embodiment, the trend feature is obtained by the linear trend method: the trend component of the time series is separated out, and the slope of the linear function is taken as the trend feature of the series. That is, a regression model of the time series $x_t$, $t=1,2,\dots,T$, on time $t$ is built, $x_t=\alpha+\beta t+\varepsilon_t$, where $\alpha$ is the intercept, $\beta$ is the slope, and $\varepsilon_t$ is the error. The least-squares estimate of $\beta$ is $\hat{\beta}=\dfrac{\sum_{t=1}^{T}(t-\bar{t})(x_t-\bar{x})}{\sum_{t=1}^{T}(t-\bar{t})^2}$, where $\bar{t}=\frac{1}{T}\sum_{t=1}^{T}t$, $\bar{x}=\frac{1}{T}\sum_{t=1}^{T}x_t$, and $T$ is the length of the traffic time series;
The seasonal feature is obtained by the H-P filter method, in which the trend component is estimated by minimizing the difference between the time series $x_t$ and the trend value $y_t$: $\min_{\{y_t\}}\left\{\sum_{t=1}^{T}(x_t-y_t)^2+\lambda\sum_{t=2}^{T-1}\left[(y_{t+1}-y_t)-(y_t-y_{t-1})\right]^2\right\}$, where $T$ is the length of the traffic time series and $\lambda$ is the penalty factor on fluctuations of the trend component. The periodic component follows as $C_t=x_t-y_t=\dfrac{\lambda(1-L)^2(1-L^{-1})^2}{1+\lambda(1-L)^2(1-L^{-1})^2}\,x_t$, where $L$ is the lag operator. When $C_t$ shows a distinct peak, the time series $x_t$ can be judged to contain a cyclical component, and the period corresponding to the peak is the cycle length of the series;
The nonlinear feature is reflected by the McLeod-Li test, the bispectral test, the RESET test, the F test, or a neural-network-based nonlinearity test statistic.
Other methods of obtaining the above global features are not excluded.
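As an illustration of the linear trend method, the least-squares slope of the series on the time index can be computed directly. This is a minimal sketch; the function name `trend_slope` is hypothetical, and time is assumed to run over t = 1, 2, ..., T as in the text.

```python
def trend_slope(x):
    """Least-squares slope of x_t regressed on t = 1..T (needs T >= 2)."""
    T = len(x)
    t_mean = (T + 1) / 2          # mean of 1..T
    x_mean = sum(x) / T
    num = sum((t - t_mean) * (xt - x_mean) for t, xt in enumerate(x, start=1))
    den = sum((t - t_mean) ** 2 for t in range(1, T + 1))
    return num / den
```

For a series that grows by exactly two units per day, the function returns a slope of 2, which would then serve as the trend feature of that series.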
As one embodiment, the clustering includes K-means clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
As one embodiment, the clustering includes FCM clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and FCM clustering is performed on the new feature vectors.
Other clustering methods are not excluded.
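The K-means step on the per-station feature vectors can be sketched as follows. This is a plain Lloyd-style iteration written for illustration only; the names `sqdist` and `kmeans` are hypothetical, and seeding the centres with the first k vectors is an assumption made to keep the example deterministic.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans(points, k, iters=100):
    """Cluster feature vectors; returns (labels, centers)."""
    centers = [list(p) for p in points[:k]]  # assumption: first k points as seeds
    labels = None
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centre.
        new_labels = [min(range(k), key=lambda c: sqdist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments unchanged, so the centres are stable
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

Each entry of `points` would be one base station's feature vector, for example its trend, seasonal, kurtosis, and skewness values. Because the features have different scales, standardising each one before clustering is a common precaution, although the text does not prescribe it.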
For a better understanding of the method, an application example is described in detail below:
A. Collect the traffic time series $\{x_t,\ t=1,2,\dots,T\}$ of each base station to be predicted daily, continuously for half a year;
B. Extract the global features of the traffic time series of each base station, including the trend feature, seasonal feature, kurtosis feature, skewness feature, autocorrelation coefficient feature, nonlinear feature, and spectrum feature;
C. Take the extracted global features of each base station as a new feature vector, so that the traffic time series of each base station corresponds to one new feature vector, and apply the K-means clustering method to the new feature vectors;
D. For each class of base station data after clustering, select appropriate attributes according to its features: if the traffic data shows a trend feature, collect the ARPU value and 3G penetration rate related to the traffic data; if the traffic data shows periodicity, collect the ARPU value, 3G penetration rate, and total number of users related to the traffic data;
E. Build a BP neural network with a three-layer structure and the transfer function tansig, and train it;
F. Feed the collected attribute features of the traffic data to be predicted, together with the traffic at the previous moment, into the model trained in the previous step to calculate the traffic to be predicted. For example, inputting the collected attribute features of today's traffic data and yesterday's traffic predicts today's traffic.
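Steps E and F can be illustrated with a small three-layer back-propagation network. This is a sketch under stated assumptions rather than the patented configuration: the hidden layer uses `math.tanh`, which is numerically equivalent to tansig; the output layer is linear; training is plain stochastic gradient descent; and the names `train_bp` and `predict`, the hidden size, and the learning rate are all hypothetical.

```python
import math
import random


def _hidden(w1, x):
    """Hidden activations; each row of w1 holds input weights plus a bias."""
    return [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w1]


def predict(model, x):
    """Forward pass: inputs -> tanh hidden layer -> linear output."""
    w1, w2 = model
    h = _hidden(w1, x)
    return sum(w * v for w, v in zip(w2, h)) + w2[-1]


def train_bp(samples, targets, hidden=4, lr=0.05, epochs=2000, seed=0):
    """Train on (features, next traffic) pairs with squared-error loss."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            h = _hidden(w1, x)
            err = sum(w * v for w, v in zip(w2, h)) + w2[-1] - y
            for j in range(hidden):
                # Back-propagate through tanh before touching w2[j].
                grad_h = err * w2[j] * (1 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                w1[j][-1] -= lr * grad_h
            w2[-1] -= lr * err
    return w1, w2
```

Each input vector would hold the collected attribute features (for instance ARPU value and 3G penetration rate) followed by the previous day's traffic, all scaled to a comparable range, and the target would be the current day's traffic.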
The global features of the traffic time series in step B are extracted as follows:
B1. The trend feature is measured by the Z statistic, calculated as:

$$Z=\begin{cases}\dfrac{S-1}{\sqrt{\mathrm{Var}(S)}}, & S>0\\[4pt] 0, & S=0\\[4pt] \dfrac{S+1}{\sqrt{\mathrm{Var}(S)}}, & S<0\end{cases}$$

where $S$ is a normally distributed statistic and $\mathrm{Var}(S)$ is the variance of $S$. $S$ is calculated as $S=\sum_{k=1}^{T-1}\sum_{j=k+1}^{T}\operatorname{sgn}(x_j-x_k)$, and $\mathrm{Var}(S)$ as $\mathrm{Var}(S)=T(T-1)(2T+5)/18$. Here $x_t$, $t=1,2,\dots,T$, is the traffic time series, $T$ is its length, $x_j$ and $x_k$ are its values at times $j$ and $k$, and the sign function is:

$$\operatorname{sgn}(x_j-x_k)=\begin{cases}1, & x_j-x_k>0\\ 0, & x_j-x_k=0\\ -1, & x_j-x_k<0\end{cases}$$

B2. The seasonal feature is reflected by the average period, which is computed as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series; the average frequency is then computed from the resulting spectrum and the frequencies used; and the average period is calculated from the average frequency;
B3. The kurtosis in the kurtosis feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
B4. The skewness in the skewness feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
B5. The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, calculated as $Q=T(T+2)\sum_{\tau=1}^{p}\dfrac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the series, calculated as $r_\tau=\dfrac{\sum_{t=\tau+1}^{T}(x_t-\bar{x})(x_{t-\tau}-\bar{x})}{\sum_{t=1}^{T}(x_t-\bar{x})^2}$, where $x_t$, $t=1,2,\dots,T$, is the traffic time series and $\bar{x}$ is its mean;
B6. The nonlinear feature is reflected by the BDS test statistic. For the traffic time series $x_t$, $t=1,2,\dots,T$, with observations $x_s$ and $x_w$ at times $s$ and $w$, all observation pairs $(x_s,x_w)$ are arranged as $\{(x_s,x_w),(x_{s+1},x_{w+1}),(x_{s+2},x_{w+2}),\dots,(x_{s+m-1},x_{w+m-1})\}$, where $m$ is the embedding dimension. The BDS statistic is calculated as $W(N,m,r)=\sqrt{N}\,\dfrac{C(N,m,r)-C(N,1,r)^m}{\sigma'(N,m,r)}$, where $r$ is the interval size, $C(N,m,r)$ is the correlation integral, and $\sigma'(N,m,r)$ is the estimate of the asymptotic standard deviation of $C(N,m,r)-C(N,1,r)^m$. $C(N,m,r)$ is calculated as $C(N,m,r)=\dfrac{2}{(N-m+1)(N-m)}\sum_{s=1}^{N-m+1}\sum_{w=s+1}^{N-m+1}I_r(X_s^m,X_w^m)$, where $X_s^m=(x_s,x_{s+1},\dots,x_{s+m-1})$ is an $m$-dimensional vector and $I_r(X_s^m,X_w^m)$ equals 1 if every component of $X_s^m$ and $X_w^m$ differs by less than $r$ and 0 otherwise;
B7. The spectrum feature consists of the first two coefficients of the extracted discrete Fourier transform (DFT).
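Step B5 can be illustrated with a direct computation of the autocorrelation coefficients and the Ljung-Box Q statistic in its standard form. This is a minimal sketch; the function names `autocorr` and `ljung_box_q` are hypothetical.

```python
def autocorr(x, lag):
    """Sample autocorrelation coefficient r_tau of x at the given lag."""
    T = len(x)
    mean = sum(x) / T
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, T))
    return num / den


def ljung_box_q(x, p):
    """Q = T (T + 2) * sum_{tau=1..p} r_tau^2 / (T - tau)."""
    T = len(x)
    return T * (T + 2) * sum(autocorr(x, tau) ** 2 / (T - tau)
                             for tau in range(1, p + 1))
```

A large Q relative to a chi-square distribution with p degrees of freedom suggests the series is not white noise. Here the raw Q value itself would be used as the autocorrelation feature of the station.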
In one embodiment of the network traffic analysis and prediction device, as shown in Fig. 2, the device includes:
an extraction module, for extracting the global features of the traffic time series of each base station under test;
a clustering module, for clustering according to the extracted global features;
a collection module, for collecting attribute features of the traffic data according to the clustering result;
a prediction module, for predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment.
As shown in Fig. 2, in a preferred embodiment the modules of the device are connected in sequence: the extraction module, the clustering module, the collection module, and the prediction module.
The extraction module first extracts the global features of the traffic time series of each base station under test; the clustering module then clusters according to the extracted global features; the collection module next collects attribute features of the traffic data according to the clustering result; finally, the prediction module feeds the attribute features of the traffic data and the traffic at the previous moment into the neural network to predict traffic. The traffic analysis performed by the device is more reasonable, the prediction draws on more information and is more accurate, and the device is suitable for practical application.
As one embodiment, the global features include any one or more of a trend feature, a seasonal feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinear feature, and a spectrum feature.
As one embodiment, the traffic time series is obtained by collecting the traffic data of each base station under test daily, continuously for half a year.
As one embodiment, the trend feature is measured by the Z statistic: a Z statistic greater than zero indicates a rising trend, and a Z statistic less than zero indicates a falling trend. The Z statistic is calculated as:

$$Z=\begin{cases}\dfrac{S-1}{\sqrt{\mathrm{Var}(S)}}, & S>0\\[4pt] 0, & S=0\\[4pt] \dfrac{S+1}{\sqrt{\mathrm{Var}(S)}}, & S<0\end{cases}$$

where $S$ is a normally distributed statistic and $\mathrm{Var}(S)$ is the variance of $S$. $S$ is calculated as $S=\sum_{k=1}^{T-1}\sum_{j=k+1}^{T}\operatorname{sgn}(x_j-x_k)$, and $\mathrm{Var}(S)$ as $\mathrm{Var}(S)=T(T-1)(2T+5)/18$. Here $x_t$, $t=1,2,\dots,T$, is the traffic time series, $T$ is its length, $x_j$ and $x_k$ are its values at times $j$ and $k$, and the sign function is:

$$\operatorname{sgn}(x_j-x_k)=\begin{cases}1, & x_j-x_k>0\\ 0, & x_j-x_k=0\\ -1, & x_j-x_k<0\end{cases}$$

The seasonal feature is reflected by the average period, which is computed as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series; the average frequency is then computed from the resulting spectrum and the frequencies used; and the average period is calculated from the average frequency;
The kurtosis in the kurtosis feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
The skewness in the skewness feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, which tests whether the traffic time series is a white-noise process. The Ljung-Box Q statistic is calculated as $Q=T(T+2)\sum_{\tau=1}^{p}\dfrac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the series, calculated as $r_\tau=\dfrac{\sum_{t=\tau+1}^{T}(x_t-\bar{x})(x_{t-\tau}-\bar{x})}{\sum_{t=1}^{T}(x_t-\bar{x})^2}$, where $x_t$, $t=1,2,\dots,T$, is the traffic time series and $\bar{x}$ is its mean;
The nonlinear feature is reflected by the BDS test statistic, which tests whether the traffic time series is independent and identically distributed. For the traffic time series $x_t$, $t=1,2,\dots,T$, with observations $x_s$ and $x_w$ at times $s$ and $w$, all observation pairs $(x_s,x_w)$ are arranged as $\{(x_s,x_w),(x_{s+1},x_{w+1}),(x_{s+2},x_{w+2}),\dots,(x_{s+m-1},x_{w+m-1})\}$, where $m$ is the embedding dimension. The BDS statistic is calculated as $W(N,m,r)=\sqrt{N}\,\dfrac{C(N,m,r)-C(N,1,r)^m}{\sigma'(N,m,r)}$, where $r$ is the interval size, $C(N,m,r)$ is the correlation integral, and $\sigma'(N,m,r)$ is the estimate of the asymptotic standard deviation of $C(N,m,r)-C(N,1,r)^m$. $C(N,m,r)$ is calculated as $C(N,m,r)=\dfrac{2}{(N-m+1)(N-m)}\sum_{s=1}^{N-m+1}\sum_{w=s+1}^{N-m+1}I_r(X_s^m,X_w^m)$, where $X_s^m=(x_s,x_{s+1},\dots,x_{s+m-1})$ is an $m$-dimensional vector and $I_r(X_s^m,X_w^m)$ equals 1 if every component of $X_s^m$ and $X_w^m$ differs by less than $r$ and 0 otherwise;
The spectrum feature consists of the first two coefficients of the extracted discrete Fourier transform (DFT). Spectrum features are extracted from the DFT coefficients, and the first $n$ coefficients may be taken as the spectrum feature: because the high-frequency part of a signal is of little importance, most of the energy in the frequency domain is concentrated in the first few coefficients.
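The spectrum feature can be illustrated by computing the first n DFT coefficients directly from the definition. This is a minimal sketch; the function name `dft_coefficients` is hypothetical, and starting at the zero-frequency coefficient (k = 0) is an assumption, since the text does not say whether that term is counted.

```python
import cmath


def dft_coefficients(x, n=2):
    """First n DFT coefficients X_k = sum_t x_t * exp(-2*pi*i*k*t/T), k = 0..n-1."""
    T = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / T) for t in range(T))
            for k in range(n)]
```

The coefficients are complex, so a feature vector would typically store their magnitudes or their real and imaginary parts separately. For long series an FFT routine would be faster than this direct sum, although for a handful of coefficients the direct form is adequate.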
As one embodiment, the trend feature is obtained by the linear trend method: the trend component of the time series is separated out, and the slope of the linear function is taken as the trend feature of the series. That is, a regression model of the time series $x_t$, $t=1,2,\dots,T$, on time $t$ is built, $x_t=\alpha+\beta t+\varepsilon_t$, where $\alpha$ is the intercept, $\beta$ is the slope, and $\varepsilon_t$ is the error. The least-squares estimate of $\beta$ is $\hat{\beta}=\dfrac{\sum_{t=1}^{T}(t-\bar{t})(x_t-\bar{x})}{\sum_{t=1}^{T}(t-\bar{t})^2}$, where $\bar{t}=\frac{1}{T}\sum_{t=1}^{T}t$, $\bar{x}=\frac{1}{T}\sum_{t=1}^{T}x_t$, and $T$ is the length of the traffic time series;
The seasonal feature is obtained by the H-P filter method, in which the trend component is estimated by minimizing the difference between the time series $x_t$ and the trend value $y_t$: $\min_{\{y_t\}}\left\{\sum_{t=1}^{T}(x_t-y_t)^2+\lambda\sum_{t=2}^{T-1}\left[(y_{t+1}-y_t)-(y_t-y_{t-1})\right]^2\right\}$, where $T$ is the length of the traffic time series and $\lambda$ is the penalty factor on fluctuations of the trend component. The periodic component follows as $C_t=x_t-y_t=\dfrac{\lambda(1-L)^2(1-L^{-1})^2}{1+\lambda(1-L)^2(1-L^{-1})^2}\,x_t$, where $L$ is the lag operator. When $C_t$ shows a distinct peak, the time series $x_t$ can be judged to contain a cyclical component, and the period corresponding to the peak is the cycle length of the series;
The nonlinear feature is reflected by the McLeod-Li test, the bispectral test, the RESET test, the F test, or a neural-network-based nonlinearity test statistic.
Other methods of obtaining the above global features are not excluded.
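The kurtosis and skewness features can be illustrated as standardized moments of the series. This is a sketch under explicit assumptions, because the exact formulas are not reproduced in this text: the sample standard deviation uses the T - 1 denominator, the moments are averaged over T, and the kurtosis is the plain fourth moment without subtracting 3. The function name `skewness_kurtosis` is hypothetical.

```python
import math


def skewness_kurtosis(x):
    """Return (skewness, kurtosis) as third and fourth standardized moments."""
    T = len(x)
    mean = sum(x) / T
    # Assumption: sample standard deviation with the T - 1 denominator.
    sigma = math.sqrt(sum((v - mean) ** 2 for v in x) / (T - 1))
    skew = sum(((v - mean) / sigma) ** 3 for v in x) / T
    kurt = sum(((v - mean) / sigma) ** 4 for v in x) / T
    return skew, kurt
```

A symmetric series gives a skewness of zero, while a series with occasional traffic spikes gives a positive skewness and a larger kurtosis. If an excess-kurtosis convention is preferred, 3 would be subtracted from the second value.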
As one embodiment, the clustering includes K-means clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
As one embodiment, the clustering includes FCM clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and FCM clustering is performed on the new feature vectors.
Other clustering methods are not excluded.
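The FCM (fuzzy c-means) alternative can be sketched in a few lines. This is an illustrative version of the usual alternating update of memberships and centres; the names `dist` and `fcm`, the fuzzifier m = 2, and the seeding of the centres with the first c vectors are assumptions, not details from the patent.

```python
import math


def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def fcm(points, c, m=2.0, iters=100, tol=1e-6):
    """Fuzzy c-means; returns (memberships, centers)."""
    centers = [list(p) for p in points[:c]]  # assumption: first c points as seeds
    dims = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)).
        u = []
        for p in points:
            d = [max(dist(p, ctr), 1e-12) for ctr in centers]  # avoid divide by zero
            u.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1)) for j in range(c))
                      for i in range(c)])
        # Centre update: mean of the points weighted by u^m.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append([sum(wk * p[k] for wk, p in zip(w, points)) / total
                                for k in range(dims)])
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return u, centers
```

Unlike K-means, each base station receives a degree of membership in every cluster. Taking the cluster with the largest membership gives a hard assignment when one is needed for the later attribute-collection step.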
The embodiments described above express only several implementations of the present invention. Although they are described specifically and in detail, they should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention. The scope of protection of this patent is therefore determined by the appended claims.

Claims (8)

1. A network traffic analysis and prediction method, characterized by comprising the following steps:
extracting the global features of the traffic time series of each base station under test, the global features including any one or more of a trend feature, a seasonal feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinear feature, and a spectrum feature;
clustering according to the extracted global features;
collecting attribute features of the traffic data according to the clustering result;
predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment;
wherein collecting the attribute features of the traffic data according to the clustering result includes:
selecting, for each class of base station data after clustering and according to its features, attribute features related to the traffic data, the attribute features including an ARPU value, a 3G penetration rate, and/or a total number of users.
2. The network traffic analysis and prediction method according to claim 1, characterized in that the traffic time series is obtained by collecting the traffic data of each base station under test daily, continuously for half a year.
3. The network traffic analysis and prediction method according to claim 1, characterized in that the trend feature is measured by the Z statistic, calculated as:

$$Z=\begin{cases}\dfrac{S-1}{\sqrt{\mathrm{Var}(S)}}, & S>0\\[4pt] 0, & S=0\\[4pt] \dfrac{S+1}{\sqrt{\mathrm{Var}(S)}}, & S<0\end{cases}$$

where $S$ is a normally distributed statistic and $\mathrm{Var}(S)$ is the variance of $S$; $S$ is calculated as $S=\sum_{k=1}^{T-1}\sum_{j=k+1}^{T}\operatorname{sgn}(x_j-x_k)$, and $\mathrm{Var}(S)$ as $\mathrm{Var}(S)=T(T-1)(2T+5)/18$; $x_t$, $t=1,2,\dots,T$, is the traffic time series, $T$ is its length, $x_j$ and $x_k$ are its values at times $j$ and $k$, and the sign function is:

$$\operatorname{sgn}(x_j-x_k)=\begin{cases}1, & x_j-x_k>0\\ 0, & x_j-x_k=0\\ -1, & x_j-x_k<0\end{cases}$$

the seasonal feature is reflected by the average period, which is computed as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series; the average frequency is then computed from the resulting spectrum and the frequencies used; and the average period is calculated from the average frequency;
the kurtosis in the kurtosis feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
the skewness in the skewness feature is calculated from the traffic time series $x_t$, $t=1,2,\dots,T$, where $T$ is the length of the series, $\bar{x}$ is its mean, and $\sigma$ is its sample standard deviation;
the autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, calculated as $Q=T(T+2)\sum_{\tau=1}^{p}\dfrac{r_\tau^2}{T-\tau}$, where $T$ is the length of the traffic time series, $p$ is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the series, calculated as $r_\tau=\dfrac{\sum_{t=\tau+1}^{T}(x_t-\bar{x})(x_{t-\tau}-\bar{x})}{\sum_{t=1}^{T}(x_t-\bar{x})^2}$, where $x_t$, $t=1,2,\dots,T$, is the traffic time series and $\bar{x}$ is its mean;
the nonlinear feature is reflected by the BDS test statistic: for the traffic time series $x_t$, $t=1,2,\dots,T$, with observations $x_s$ and $x_w$ at times $s$ and $w$, all observation pairs $(x_s,x_w)$ are arranged as $\{(x_s,x_w),(x_{s+1},x_{w+1}),(x_{s+2},x_{w+2}),\dots,(x_{s+m-1},x_{w+m-1})\}$, where $m$ is the embedding dimension; the BDS statistic is calculated as $W(N,m,r)=\sqrt{N}\,\dfrac{C(N,m,r)-C(N,1,r)^m}{\sigma'(N,m,r)}$, where $r$ is the interval size, $C(N,m,r)$ is the correlation integral, and $\sigma'(N,m,r)$ is the estimate of the asymptotic standard deviation of $C(N,m,r)-C(N,1,r)^m$; $C(N,m,r)$ is calculated as $C(N,m,r)=\dfrac{2}{(N-m+1)(N-m)}\sum_{s=1}^{N-m+1}\sum_{w=s+1}^{N-m+1}I_r(X_s^m,X_w^m)$, where $X_s^m=(x_s,x_{s+1},\dots,x_{s+m-1})$ is an $m$-dimensional vector and $I_r(X_s^m,X_w^m)$ equals 1 if every component of $X_s^m$ and $X_w^m$ differs by less than $r$ and 0 otherwise;
the spectrum feature consists of the first two coefficients of the extracted discrete Fourier transform (DFT).
4. The network traffic analysis and prediction method according to claim 1, characterized in that the clustering includes K-means clustering: the extracted global features are taken as a new feature vector, so that the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
5. A network traffic analysis and prediction device, characterized by comprising:
an extraction module, for extracting the global features of the traffic time series of each base station under test, the global features including any one or more of a trend feature, a seasonal feature, a kurtosis feature, a skewness feature, an autocorrelation coefficient feature, a nonlinear feature, and a spectrum feature;
a clustering module, for clustering according to the extracted global features;
a collection module, for collecting attribute features of the traffic data according to the clustering result;
a prediction module, for predicting traffic according to the attribute features of the traffic data and the traffic at the previous moment;
wherein collecting the attribute features of the traffic data according to the clustering result includes:
selecting, for each class of base station data after clustering and according to its features, attribute features related to the traffic data, the attribute features including an ARPU value, a 3G penetration rate, and/or a total number of users.
6. The network traffic analysis and prediction device according to claim 5, characterized in that the traffic time series is obtained by collecting the traffic data of each base station under test daily, continuously for half a year.
7. The network traffic analysis and prediction device according to claim 5, characterized in that the trend feature is measured by the Z statistic, whose calculation formula is: [formula not reproduced], where S is a statistic that follows a normal distribution and $\mathrm{Var}(S)$ is the variance of S; the calculation formula of S is: [formula not reproduced]; the calculation formula of $\mathrm{Var}(S)$ is: $\mathrm{Var}(S) = T(T-1)(2T+5)/18$; for the traffic time series $x_t$, $t = 1, 2, \ldots, T$, T is the length of the traffic time series, $x_j$ is the value of the traffic time series at time j, $x_k$ is the value of the traffic time series at time k, and the calculation formula of the sign function $\operatorname{sgn}(x_j - x_k)$ is: [formula not reproduced];
The seasonal feature is reflected by the average period, which is calculated as follows: a fast Fourier transform (FFT) is applied to the traffic time series $x_t$, $t = 1, 2, \ldots, T$, where T is the length of the traffic time series, to obtain: [formula not reproduced], where the frequencies used are: [formula not reproduced]; the average frequency is then calculated as: [formula not reproduced]; and the average period is calculated as: [formula not reproduced];
The calculation formula of the kurtosis in the kurtosis feature is: [formula not reproduced], where $x_t$ is the traffic time series, $t = 1, 2, \ldots, T$, T is the length of the traffic time series, $\bar{x}$ is the mean of the traffic time series, and $\sigma$ is the sample standard deviation of the traffic time series;
The calculation formula of the skewness in the skewness feature is: [formula not reproduced], where $x_t$ is the traffic time series, $t = 1, 2, \ldots, T$, T is the length of the traffic time series, $\bar{x}$ is the mean of the traffic time series, and $\sigma$ is the sample standard deviation of the traffic time series;
The autocorrelation coefficient feature is measured by the Ljung-Box Q statistic, whose calculation formula is: [formula not reproduced], where T is the length of the traffic time series, p is the maximum lag order considered, $\tau$ is the lag, and $r_\tau$ is the autocorrelation coefficient of the traffic time series; the calculation formula of $r_\tau$ is: [formula not reproduced], where $x_t$ is the traffic time series, $t = 1, 2, \ldots, T$, and $\bar{x}$ is the mean of the traffic time series;
The nonlinear feature is reflected by the BDS test statistic: for the traffic time series $x_t$, $t = 1, 2, \ldots, T$, the observed values at times s and w are $x_s$ and $x_w$, and all pairs of observed values $(x_s, x_w)$ are constructed as: $\{(x_s, x_w), (x_{s+1}, x_{w+1}), (x_{s+2}, x_{w+2}), \ldots, (x_{s+m-1}, x_{w+m-1})\}$, where m is the embedding dimension; the calculation formula of the BDS statistic is: [formula not reproduced], where r is the interval size, $C(N, m, r)$ is the correlation integral, and $\sigma'(N, m, r)$ is the estimate of the asymptotic standard deviation of $C(N, m, r) - C(N, 1, r)^m$; the calculation formula of $C(N, m, r)$ is: [formula not reproduced], where [formula not reproduced] is an m-dimensional vector;
The spectral feature is the first two coefficients of the extracted discrete Fourier transform (DFT).
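The formulas for several of these statistics are not reproduced in the text of claim 7, so the following is only a sketch under stated assumptions, using the standard textbook forms that match the symbols defined in the claim. Assumptions: the Z statistic is taken as $S/\sqrt{\mathrm{Var}(S)}$ with no continuity or tie correction; the sample standard deviation uses the $T-1$ denominator; kurtosis is not reduced by 3; and the spectral feature uses DFT indices 1 and 2. All function names are hypothetical.

```python
import cmath
import math
from typing import Sequence, Tuple


def sgn(v: float) -> int:
    """Sign function: 1, 0 or -1."""
    return (v > 0) - (v < 0)


def mann_kendall_z(x: Sequence[float]) -> float:
    """Trend feature: Z = S / sqrt(Var(S)) (assumed form, no corrections)."""
    t = len(x)
    s = sum(sgn(x[j] - x[k]) for k in range(t - 1) for j in range(k + 1, t))
    var_s = t * (t - 1) * (2 * t + 5) / 18  # as stated in the claim
    return s / math.sqrt(var_s)


def mean_and_std(x: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation (T - 1 denominator, assumed)."""
    t = len(x)
    mean = sum(x) / t
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (t - 1))
    return mean, std


def skewness(x: Sequence[float]) -> float:
    """Skewness feature: average cubed standardized deviation."""
    mean, std = mean_and_std(x)
    return sum(((v - mean) / std) ** 3 for v in x) / len(x)


def kurtosis(x: Sequence[float]) -> float:
    """Kurtosis feature: average fourth-power standardized deviation."""
    mean, std = mean_and_std(x)
    return sum(((v - mean) / std) ** 4 for v in x) / len(x)


def autocorrelation(x: Sequence[float], lag: int) -> float:
    """Autocorrelation coefficient r_tau at the given lag."""
    mean = sum(x) / len(x)
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, len(x)))
    return num / denom


def ljung_box_q(x: Sequence[float], max_lag: int) -> float:
    """Ljung-Box Q = T (T + 2) * sum over tau of r_tau^2 / (T - tau)."""
    t = len(x)
    return t * (t + 2) * sum(
        autocorrelation(x, lag) ** 2 / (t - lag)
        for lag in range(1, max_lag + 1)
    )


def dft_coefficient(x: Sequence[float], k: int) -> complex:
    """k-th DFT coefficient, computed directly from the definition."""
    t = len(x)
    return sum(v * cmath.exp(-2j * math.pi * k * n / t)
               for n, v in enumerate(x))


def spectral_feature(x: Sequence[float]) -> Tuple[complex, complex]:
    """First two DFT coefficients (indices 1 and 2, an assumption)."""
    return dft_coefficient(x, 1), dft_coefficient(x, 2)
```

For a strictly increasing series of length 5, S equals 10 and Var(S) equals 300/18, so `mann_kendall_z` returns a positive value indicating an upward trend. The average period and the BDS statistic are left out of this sketch because their exact forms cannot be recovered from the text.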
8. The network traffic analysis and prediction device according to claim 5, characterized in that the clustering comprises K-means clustering: the extracted global features are taken as a new feature vector, the traffic time series of each base station under test corresponds to one new feature vector, and K-means clustering is performed on the new feature vectors.
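The clustering step of claim 8 can be sketched as follows. This is a minimal illustration of Lloyd's K-means algorithm, assuming each base station has already been reduced to one numeric feature vector. The function name `kmeans`, the random initialization from the data points, and the `iterations` and `seed` parameters are assumptions, not details taken from the patent.

```python
import math
import random
from typing import List, Sequence, Tuple


def kmeans(vectors: Sequence[Sequence[float]], k: int,
           iterations: int = 100,
           seed: int = 0) -> Tuple[List[int], List[List[float]]]:
    """Cluster feature vectors into k groups with Lloyd's algorithm.

    Returns (labels, centroids); labels[i] is the cluster of vectors[i].
    """
    rng = random.Random(seed)
    # Initialize centroids from k distinct data points (assumed strategy).
    centroids = [list(v) for v in rng.sample(list(vectors), k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids
```

Because the global features have very different scales (for example a Z statistic versus a Ljung-Box Q value), standardizing each feature before clustering would usually be sensible; the patent text here does not say whether this is done.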
CN201410019136.7A 2014-01-15 2014-01-15 Network traffic analysis and Forecasting Methodology and device Active CN103747477B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410019136.7A CN103747477B (en) 2014-01-15 2014-01-15 Network traffic analysis and Forecasting Methodology and device

Publications (2)

Publication Number Publication Date
CN103747477A CN103747477A (en) 2014-04-23
CN103747477B true CN103747477B (en) 2017-08-25

Family

ID=50504455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410019136.7A Active CN103747477B (en) 2014-01-15 2014-01-15 Network traffic analysis and Forecasting Methodology and device

Country Status (1)

Country Link
CN (1) CN103747477B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10417226B2 (en) * 2015-05-29 2019-09-17 International Business Machines Corporation Estimating the cost of data-mining services
CN107517166A (en) * 2016-06-16 2017-12-26 中兴通讯股份有限公司 Flow control methods, device and access device
TWI641251B (en) 2016-11-18 2018-11-11 財團法人工業技術研究院 Method and system for monitoring network flow
CN107135126B (en) * 2017-05-22 2020-03-24 安徽师范大学 Flow online identification method based on sub-flow fractal index
CN110098944B (en) * 2018-01-29 2020-09-08 中国科学院声学研究所 Method for predicting protocol data traffic based on FP-Growth and RNN
CN108770002B (en) * 2018-04-27 2021-08-10 广州杰赛科技股份有限公司 Base station flow analysis method, device, equipment and storage medium
CN108960537B (en) * 2018-08-17 2020-10-13 安吉汽车物流股份有限公司 Logistics order prediction method and device and readable medium
CN113037577B (en) * 2019-12-09 2023-03-24 中国电信股份有限公司 Network traffic prediction method, device and computer readable storage medium
CN112235152B (en) * 2020-09-04 2022-05-10 北京邮电大学 Flow size estimation method and device
CN111935766B (en) * 2020-09-15 2021-01-12 之江实验室 Wireless network flow prediction method based on global spatial dependency
CN113225824A (en) * 2021-04-28 2021-08-06 辽宁邮电规划设计院有限公司 Device and method for automatically allocating bandwidths with different service requirements based on 5G technology
CN114330145B (en) * 2022-03-01 2022-07-12 北京蚂蚁云金融信息服务有限公司 Method and device for analyzing sequence based on probability map model
CN114793197B (en) * 2022-03-29 2023-09-19 广州杰赛科技股份有限公司 Network resource allocation method, device, equipment and storage medium based on NFV
CN115949891B (en) * 2023-03-09 2023-05-23 天津佰焰科技股份有限公司 Intelligent control system and control method for LNG (liquefied Natural gas) station

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050213514A1 (en) * 2004-03-23 2005-09-29 Ching-Fong Su Estimating and managing network traffic
CN101252541A (en) * 2008-04-09 2008-08-27 中国科学院计算技术研究所 Method for establishing network flow classified model and corresponding system thereof
CN103227999A (en) * 2013-05-02 2013-07-31 中国联合网络通信集团有限公司 Network traffic prediction method and device
CN103368811A (en) * 2012-04-06 2013-10-23 华为终端有限公司 Bandwidth distribution method and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Network traffic analysis and prediction based on time series; 何建; 《中国科技信息》; 20051231 (Issue 22); full text *

Also Published As

Publication number Publication date
CN103747477A (en) 2014-04-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant