CN108062561B - Short-time data flow prediction method based on long-time and short-time memory network model - Google Patents


Info

Publication number
CN108062561B
CN108062561B (application CN201711264618.9A)
Authority
CN
China
Prior art keywords
data flow
training
short
sample
test sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711264618.9A
Other languages
Chinese (zh)
Other versions
CN108062561A (en)
Inventor
Xue Yang (薛洋)
Xue Zelong (薛泽龙)
Li Lei (李磊)
Current Assignee
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201711264618.9A priority Critical patent/CN108062561B/en
Publication of CN108062561A publication Critical patent/CN108062561A/en
Application granted granted Critical
Publication of CN108062561B publication Critical patent/CN108062561B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: Physics
    • G06: Computing; calculating or counting
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/044: Recurrent networks, e.g. Hopfield networks
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06F 18/00: Pattern recognition
    • G06F 18/23213: Non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/24147: Classification based on distances to closest patterns, e.g. nearest-neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a short-time data flow prediction method based on a long-time and short-time memory network model, comprising the following steps: first, a plurality of training samples are obtained at an observation point; features are then extracted from the training samples, and the samples are classified by these features into two classes, either sharp and gentle data flow value change trends, or ascending and descending change trends. An LSTM model is trained with all training samples to obtain a trained main model, and the main model is then trained further with each of the two classes of training samples to obtain a first-class submodel and a second-class submodel respectively. A test sample of the observation point is obtained and classified by a classifier; according to the classification result it is input into the first-class or second-class submodel, which predicts the data flow value at the next time point of the observation point. The method improves the accuracy of short-time data stream prediction.

Description

Short-time data flow prediction method based on long-time and short-time memory network model
Technical Field
The invention belongs to the technical field of pattern recognition and artificial intelligence, and particularly relates to a short-time data flow prediction method based on a long-time and short-time memory network model.
Background
With the continuous and steady development of the world economy, many services in many countries are characterized by data streams, such as network load flow and traffic flow, and prediction is an effective way to optimize these services: predicting Internet load data supports more suitable resource scheduling for the next moment, and predicting traffic flow allows traffic resources to be configured optimally.
At present, many short-time data flow prediction methods use a single model. A data flow, however, is a nonlinear and random signal in which multiple mixed components are difficult to distinguish and separate. The prediction performance of a single model therefore hits a bottleneck: a model that predicts the data stream well under congestion often predicts it poorly when traffic is unblocked, so existing short-time data stream prediction methods suffer from low prediction precision.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a short-time data stream prediction method based on a Long Short Term Memory (LSTM) network model, and the method can more accurately predict the short-time data stream.
The purpose of the invention is realized by the following technical scheme: a short-time data flow prediction method based on a long-time and short-time memory network model comprises the following steps:
step S1, aiming at an observation point of the data stream needing to be predicted in a short time, firstly collecting data flow values counted by a plurality of historical time points at the observation point, and then splicing and aggregating the collected data flow values counted by each historical time point into a one-dimensional array according to a time sequence and carrying out normalization processing; wherein the time intervals of every two adjacent time points are the same and are both T minutes; the data flow value counted at each time point refers to the data flow generated in the time interval from the last time point to the time point;
step S2, performing sliding-window processing on the normalized one-dimensional array obtained in step S1 to obtain a plurality of training samples, and taking, for each training sample, the data flow value counted at the time point immediately following the sample's last time point as the label of the training sample; each training sample comprises the data flow values counted at a plurality of time points;
step S3, extracting the characteristics of the training sample: performing first-order difference processing on the training sample to obtain the characteristics of the training sample; clustering the training samples according to the characteristics of the training samples, and separating two types of training samples with violent and gentle data flow value change trends or separating two types of training samples with ascending and descending data flow value change trends;
step S4, obtaining an LSTM model after model parameter initialization; then training the LSTM model, specifically: firstly, training an LSTM model by taking each training sample in all training samples as input and taking a label corresponding to each training sample as output to obtain a trained main model; inputting a class of training samples with violent or ascending data flow value variation trend acquired in the step S3 into the trained main model for training to obtain a first class of sub-model; meanwhile, inputting a class of training samples with a gentle or reduced data flow value change trend acquired in the step S3 into the trained main model for training to obtain a second class of sub models;
step S5, when the data flow value counted at the next time point of the observation point is to be predicted at the current time point, a sample composed of the data flow values counted at the current time point of the observation point and at a plurality of time points before it is first obtained through a sliding window, and this sample is used as the test sample; in the test sample, the time interval between every two adjacent time points is the same, T1 minutes, where T1 = T; the time interval between the current time point and the next time point whose data flow value is to be predicted is also T minutes;
step S6, extracting the features of the test sample: performing first-order difference processing on the test sample to obtain its features; the classifier then judges, according to these features, whether the test sample belongs to the class whose data flow value change trend is sharp or ascending, or to the class whose trend is gentle or descending; the classifier is trained with the class-separated training samples from step S3 as input and the class each sample belongs to as output;
s7, when the test sample belongs to a sample with a violent or ascending data flow value change trend, inputting the test sample into the first type submodel obtained in the S4, and predicting the data flow value counted at the next time point of the observation point through the first type submodel;
and when the test sample belongs to a sample with a gradual or reduced data flow value change trend, inputting the test sample into the second type submodel acquired in the step S4, and predicting the data flow value counted at the next time point of the observation point through the second type submodel.
Preferably, in step S1, the normalization process is performed on the one-dimensional array obtained after the splicing and aggregation in the following manner:
x_f' = (x_f - x_min) / (x_max - x_min)

where x_f is the f-th element of the one-dimensional array, and x_max and x_min are the maximum and minimum values in the spliced and aggregated one-dimensional array, respectively.
Preferably, the length of the sliding window is N, wherein when the data stream is a traffic data stream, the product of the length of the sliding window and the time interval T of each two adjacent time points satisfies the following relationship: NxT is less than or equal to 60.
Furthermore, in step S1, when the data stream is a traffic data stream, T is less than or equal to 30;
and when the observation point historically counts the data flow once every E minutes, T is E, 2E, ..., (n-1)E or nE, where n and E are fixed values and nE is less than or equal to 30.
Further, when T is Y minutes, the length N of the sliding window is an integer value of 2 to 60/Y.
Preferably, in step S3, if the data stream is a traffic data stream, the feature extraction process of the training sample is as follows: performing first-order difference processing on the training samples, and taking an absolute value of a result of each first-order difference as a characteristic of the training samples; wherein
When the training sample is [x_t, x_{t-1}, ..., x_{t-N+1}], first-order difference processing gives:
[x_t - x_{t-1}, x_{t-1} - x_{t-2}, ..., x_{t-N+2} - x_{t-N+1}];
where x_t, x_{t-1}, ..., x_{t-N+1} are the data flow values counted at time points t, t-1, ..., t-N+1 respectively, and N is the length of the sliding window;
the features of the training sample obtained in step S3 are:
[|x_t - x_{t-1}|, |x_{t-1} - x_{t-2}|, ..., |x_{t-N+2} - x_{t-N+1}|];
in step S3, if the data flow is a network load data flow, the feature extraction process of the training sample is as follows: performing first-order difference processing on the training sample, and taking a first-order difference result of the training sample as the characteristic of the training sample; wherein
When the training sample is [x_t, x_{t-1}, ..., x_{t-N+1}], first-order difference processing gives the features of the training sample as:
[x_t - x_{t-1}, x_{t-1} - x_{t-2}, ..., x_{t-N+2} - x_{t-N+1}].
preferably, in step S3, according to the characteristics of the training samples, all the training samples are separated by K-means clustering, so as to separate two types of training samples with a severe data flow rate value variation trend and a moderate data flow rate value variation trend, or two types of training samples with an increasing and decreasing data flow rate value variation trend.
Preferably, in step S4, when the LSTM model is initialized, a matrix of the specified dimensions is first generated, and singular value decomposition is performed on it to produce three matrices U, Σ and V; the matrix U is used as the initial value of the weight matrices of the input gate, forgetting gate, output gate and candidate state values in the LSTM hidden layer, and all bias vectors in the LSTM model are set to 0.
Preferably, in step S6, the classifier is a K-nearest neighbor classifier.
Preferably, in step S6, if the data stream is a traffic data stream, the feature extraction process of the test sample is as follows: performing first-order difference processing on the test sample, and taking the absolute value of each first-order difference result as the feature of the test sample; wherein
When the obtained test sample is [x_{t'}, x_{t'-1}, ..., x_{t'-N+1}], first-order difference processing gives:
[x_{t'} - x_{t'-1}, x_{t'-1} - x_{t'-2}, ..., x_{t'-N+2} - x_{t'-N+1}];
where x_{t'} is the data flow value counted at the current time point t', and x_{t'-1}, ..., x_{t'-N+1} are the data flow values counted at time points t'-1, ..., t'-N+1 respectively; N is the length of the sliding window;
the features of the test sample obtained in step S6 are:
[|x_{t'} - x_{t'-1}|, |x_{t'-1} - x_{t'-2}|, ..., |x_{t'-N+2} - x_{t'-N+1}|];
in step S6, if the data flow is a network load data flow, the feature extraction process of the test sample is as follows: performing first-order difference processing on the test sample, and taking a first-order difference result of the test sample as the characteristic of the test sample; wherein
When the obtained test sample is [x_{t'}, x_{t'-1}, ..., x_{t'-N+1}], first-order difference processing gives the features of the test sample as:
[x_{t'} - x_{t'-1}, x_{t'-1} - x_{t'-2}, ..., x_{t'-N+2} - x_{t'-N+1}].
compared with the prior art, the invention has the following advantages and effects:
(1) In the short-time data flow prediction method, a plurality of training samples are first obtained at an observation point, each comprising the data flow values counted at a plurality of consecutive time points; a first-order difference is then taken of each training sample and its absolute values are used as the sample's features, by which the training samples are classified into two classes, one whose data flow value change trend is sharp or rising and one whose trend is gentle or falling; the LSTM model is then trained with all training samples to obtain a trained main model, and the main model is trained further with each of the two classes of training samples to obtain a first-class submodel and a second-class submodel respectively. When the data flow value at the next time point of the observation point needs to be predicted, a test sample of the observation point is first obtained and classified by a classifier, which yields the data flow change trend of the test sample; according to the classification result the test sample is input into the first-class or second-class submodel, which predicts the data flow value at the next time point of the observation point.
In this method, a submodel suited to sharp or rising data flow value change trends and a submodel suited to gentle or falling trends are obtained from the two classes of training samples; once a test sample is obtained, its data flow change trend can be identified and the data flow at the next time point predicted by the corresponding submodel. The method can therefore model flows that are nonlinear and random, predicting the data flow value at the next time point from the values counted at the current and preceding time points of the observation point, with the advantage of high short-time prediction accuracy.
(2) In the short-time data flow prediction method, aiming at an observation point of a short-time data flow needing to be predicted, historical data flow values counted by a plurality of continuous time points are collected at the observation point, and then the collected data flow values counted by each time point in the history are spliced and aggregated into a one-dimensional array according to a time sequence and are subjected to normalization processing; performing windowing processing on a sliding window aiming at the one-dimensional array obtained by aggregation so as to obtain a plurality of training samples; in the method, the training sample is composed of data flow values counted by a plurality of continuous time points; the time points can be continuous time points when the original data actually counts the data flow value or time points separated by a certain time, so that the training samples are very easy to obtain.
(3) According to the short-time data stream prediction method, the training sample is subjected to first-order difference, and then the absolute value is taken as the characteristic of the training sample, wherein the first-order difference characteristic of the training sample can represent the change rate of the data stream, and the change trend of the data stream can be effectively represented after the absolute value is taken. Therefore, the invention can accurately classify the types of the training samples by taking the first-order difference absolute value as the characteristic of the training samples, and provides further guarantee for accurate prediction of the short-time data stream.
(4) According to the short-time data flow prediction method, all training samples are separated through K-means clustering according to the characteristics of the training samples, and two types of training samples with violent and gentle data flow value change trends or two types of training samples with ascending and descending data flow value change trends can be effectively separated.
Drawings
FIG. 1 is a flow chart of a short-term data stream prediction method according to the present invention.
Fig. 2 is a graph showing the effect of K-means clustering on the data flow values randomly acquired at each time point in one day when the time interval between every two adjacent time points is 5 minutes in example 1 of the present invention.
Fig. 3 is a graph showing the effect of K-means clustering on the data flow rate values counted at each time point in the year when the time interval between every two adjacent time points is 5 minutes in example 1 of the present invention.
Fig. 4 is a comparison graph of the predicted value of the flow rate value at each time point and the actually counted data flow rate value when the time interval between every two adjacent time points is 5 minutes in example 1 of the present invention.
FIG. 5 is a MAPE comparison graph of the prediction data streams of the present invention and other models in example 1 with different length sliding windows at 5 minutes intervals between two adjacent time points.
Fig. 6 is a graph showing the effect of K-means clustering on the data flow rate values counted at each time point in 10 days taken at random when the time interval between every two adjacent time points is 60 minutes in example 2 of the present invention.
Fig. 7 is a graph showing the effect of K-means clustering on the data flow rate values counted at each time point in 100 days randomly taken when the time interval between every two adjacent time points is 60 minutes in example 2 of the present invention.
Fig. 8 is a comparison graph of the predicted value of the flow rate value at each time point and the actually counted data flow rate value when the time interval between two adjacent time points is 60 minutes in example 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Examples
The embodiment discloses a short-term data flow prediction method based on a long-term and short-term memory network model (LSTM), as shown in fig. 1, the steps are as follows:
step S1, aiming at an observation point of the data stream needing to be predicted in a short time, firstly collecting data flow values counted by a plurality of historical time points at the observation point, and then splicing and aggregating the collected data flow values counted by each historical time point into a one-dimensional array according to a time sequence and carrying out normalization processing; wherein the time intervals of every two adjacent time points are the same and are both T minutes; the data flow value counted at each time point refers to the data flow generated in the time interval from the last time point to the time point;
in this step, normalization processing is performed on the one-dimensional array obtained after splicing and polymerization in the following manner:
x_f' = (x_f - x_min) / (x_max - x_min)

where x_f is the f-th element of the one-dimensional array, and x_max and x_min are the maximum and minimum values in the spliced and aggregated one-dimensional array, respectively.
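The normalization above is plain min-max scaling and can be sketched in a few lines of Python (the function name and sample values are illustrative, not from the patent):

```python
def min_max_normalize(values):
    """Min-max normalization as in step S1: x' = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    return [(x - x_min) / span for x in values], x_min, x_max

normalized, x_min, x_max = min_max_normalize([120.0, 80.0, 100.0, 160.0])
# normalized -> [0.5, 0.0, 0.25, 1.0]
```

The returned x_min and x_max are kept so that the test sample in step S5 can be normalized with exactly the same constants.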
In the step, T is a fixed value, and when the data stream is a traffic data stream, T is less than or equal to 30;
when the observation point history is data flow counted once every E minutes, T can be E, 2E, …, (n-1) E or nE, wherein n and E are both a certain value, and nE is less than or equal to 30 when the data flow is a traffic data flow. For example, when the observation point is to count data traffic every 5 minutes, then T may be 5, 10, 15, 20, 25, or 30.
Step S2, performing sliding-window processing on the normalized one-dimensional array obtained in step S1 to obtain a plurality of training samples, and taking, for each training sample, the data flow value counted at the time point immediately following the sample's last time point as the label of the training sample; each training sample comprises the data flow values counted at a plurality of time points. For example, for a training sample [x_t, x_{t-1}, ..., x_{t-N+1}], x_t, the data flow value counted at time point t, is the value counted at the last time point in the sample; in this embodiment, the data flow value x_{t+1} counted at the next time point t+1 is used as the label of the training sample.
In this step, the length of the sliding window is N, wherein the product of the length of the sliding window and the time interval T of each two adjacent time points satisfies the following relationship: NxT is less than or equal to 60.
In this step, when T is Y minutes, the length N of the sliding window is an integer value of 2 to 60/Y.
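The sliding-window construction of steps S2 and S5 can be sketched as follows (a hypothetical helper; the series values are illustrative). Each window of N consecutive normalized flow values becomes one sample, and the value counted at the time point just after the window becomes its label:

```python
def make_windows(series, n):
    """Slide a window of length n over the normalized series (step S2).

    Each sample holds n consecutive flow values; its label is the value
    counted at the time point right after the window.
    """
    samples, labels = [], []
    for i in range(len(series) - n):
        samples.append(series[i:i + n])
        labels.append(series[i + n])
    return samples, labels

samples, labels = make_windows([0.1, 0.2, 0.4, 0.3, 0.5], 3)
# samples -> [[0.1, 0.2, 0.4], [0.2, 0.4, 0.3]], labels -> [0.3, 0.5]
```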
Step S3, extracting the characteristics of the training sample: performing first-order difference processing on the training samples, and taking an absolute value of a result of each first-order difference as a characteristic of the training samples; clustering the training samples according to the characteristics of the training samples, and separating two types of training samples with violent data flow value change trend and gentle data flow value change trend, or separating two types of training samples with ascending data flow value change trend and descending data flow value change trend;
in this step, if the data stream is a traffic data stream, the feature extraction process of the training sample is as follows: performing first-order difference processing on the training samples, and taking an absolute value of a result of each first-order difference as a characteristic of the training samples; wherein
When the training sample is [x_t, x_{t-1}, ..., x_{t-N+1}], first-order difference processing gives:
[x_t - x_{t-1}, x_{t-1} - x_{t-2}, ..., x_{t-N+2} - x_{t-N+1}];
where x_t, x_{t-1}, ..., x_{t-N+1} are the data flow values counted at time points t, t-1, ..., t-N+1 respectively, and N is the length of the sliding window; x_t, the data flow value at time point t, refers to the data flow generated in the time interval from time point t-1 to time point t, and the values counted at the other time points are defined in the same way.
The training samples obtained in this step are characterized as follows:
[|x_t - x_{t-1}|, |x_{t-1} - x_{t-2}|, ..., |x_{t-N+2} - x_{t-N+1}|];
in this step, if the data stream is a network load data stream, the feature extraction process of the training sample is as follows: performing first-order difference processing on the training sample, and taking a first-order difference result of the training sample as the characteristic of the training sample; wherein
When the training sample is [x_t, x_{t-1}, ..., x_{t-N+1}], first-order difference processing gives the features of the training sample as:
[x_t - x_{t-1}, x_{t-1} - x_{t-2}, ..., x_{t-N+2} - x_{t-N+1}].
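The two feature-extraction variants above, absolute first-order differences for traffic data and signed differences for network load data, can be sketched together (helper name and sample values are illustrative):

```python
def diff_features(sample, absolute=True):
    """First-order difference of a sample [x_t, x_{t-1}, ..., x_{t-N+1}].

    absolute=True  -> |x_t - x_{t-1}|, ... (traffic data streams, step S3)
    absolute=False -> signed differences  (network load data streams)
    """
    diffs = [a - b for a, b in zip(sample, sample[1:])]
    return [abs(d) for d in diffs] if absolute else diffs

# Samples are ordered newest-first, as in the description.
feats = diff_features([0.875, 0.375, 0.5, 0.5])          # -> [0.5, 0.125, 0.0]
signed = diff_features([0.875, 0.375, 0.5, 0.5], False)  # -> [0.5, -0.125, 0.0]
```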
in the step, according to the characteristics of the training samples, all the training samples are separated through K-means clustering to separate two types of training samples with sharp and gentle data flow value variation trends, or two types of training samples with rising and falling data flow value variation trends. In this embodiment, when the data stream is a traffic data stream, two types of training samples with a sharp and gentle data traffic value variation trend are separated, and when the data stream is a network load data stream, two types of training samples with a data traffic value variation trend that increases and decreases are separated.
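As a rough illustration of the K-means separation, here is a minimal pure-Python 2-means on first-order-difference feature vectors (a simplified sketch, not the patent's implementation; the seeding strategy and data are illustrative):

```python
def kmeans_two(points, iters=20):
    """Minimal 2-means clustering (step S3): split samples whose flow
    changes sharply from those whose flow changes gently, using their
    first-order-difference features as points (Euclidean distance)."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    # Seed with the points of smallest and largest feature sum.
    pts = sorted(points, key=lambda p: sum(p))
    centers = [pts[0], pts[-1]]
    for _ in range(iters):
        clusters = [[], []]
        for p in points:
            clusters[0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1].append(p)
        new = [[sum(x) / len(cl) for x in zip(*cl)] if cl else c
               for c, cl in zip(centers, clusters)]
        if new == centers:
            break
        centers = new
    labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1 for p in points]
    return labels, centers

# Two gentle-change samples and two sharp-change samples (illustrative).
labels, centers = kmeans_two([[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]])
# labels -> [0, 0, 1, 1]
```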
Step S4, obtaining an LSTM model after model parameter initialization; then training the LSTM model, specifically: firstly, training an LSTM model by taking each training sample in all training samples as input and taking a label corresponding to each training sample as output to obtain a trained main model; inputting a class of training samples with violent or ascending data flow value variation trend acquired in the step S3 into the trained main model for training to obtain a first class of sub-model; in addition, inputting a class of training samples with a gentle or reduced data traffic value change trend acquired in the step S3 into the trained main model for training to obtain a second class of sub models;
in the step, when the LSTM model is initialized, a matrix of the specified dimension division is generated firstly, then singular value decomposition is carried out to generate three matrixes of a matrix U, a matrix sigma and a matrix V, and the U matrix is used as a weight matrix W of an input gate, a forgetting gate, an output gate and candidate state values in the hidden layer of the LSTM modeli、Wf、Wo、Wc、Ui、Uf、Uo、Uc、VoIs used to bias the bias vector b in the LSTM modeli、bf、bo、bcAre all taken as 0.
In this step, the training process of the main model is as follows: first, the initial weight matrices W_i, W_f, W_o, W_c, U_i, U_f, U_o, U_c, V_o and bias vectors b_i, b_f, b_o, b_c of the input gate, forgetting gate, output gate and candidate state values in the LSTM hidden layer are given; after the training samples are input, the loss is computed by forward propagation and the model parameters are updated by backpropagating its gradient, and training ends when the parameters converge or the maximum number of iterations is reached.
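For reference, one forward step of a single LSTM cell with the gates named above can be written out explicitly. This is a deliberately scalar, 1-unit sketch (parameter names mirror W_i, U_i, b_i, etc.; the real model uses weight matrices and is trained by backpropagation, which is omitted here):

```python
import math

def lstm_step(x, h_prev, c_prev, p):
    """One forward step of a 1-unit LSTM cell (gates as named in step S4).

    p maps, for each gate g in {i, f, o, c}, the input weight W_g, the
    recurrent weight U_g, and the bias b_g (all scalars for illustration).
    """
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    i = sig(p['Wi'] * x + p['Ui'] * h_prev + p['bi'])              # input gate
    f = sig(p['Wf'] * x + p['Uf'] * h_prev + p['bf'])              # forget gate
    o = sig(p['Wo'] * x + p['Uo'] * h_prev + p['bo'])              # output gate
    c_tilde = math.tanh(p['Wc'] * x + p['Uc'] * h_prev + p['bc'])  # candidate
    c = f * c_prev + i * c_tilde                                   # new cell state
    h = o * math.tanh(c)                                           # new hidden state
    return h, c

params = {k: 0.5 for k in ('Wi', 'Wf', 'Wo', 'Wc', 'Ui', 'Uf', 'Uo', 'Uc')}
params.update(bi=0.0, bf=0.0, bo=0.0, bc=0.0)
h, c = lstm_step(0.3, 0.0, 0.0, params)
```

With zero initial states, c reduces to sigmoid(0.15) * tanh(0.15), which makes the gate arithmetic easy to check by hand.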
In step S5, when the data flow value counted at the next time point of the observation point is to be predicted at the current time point, a sample consisting of the normalized data flow values counted at the current time point of the observation point and at a plurality of time points before it is first obtained through a sliding window, and this sample is used as the test sample; in the test sample, the time interval between every two adjacent time points is the same, T1 minutes, where T1 = T; the time interval between the current time point and the next time point whose data flow value is to be predicted is also T. The data flow values in the test sample are normalized with the x_min and x_max acquired in step S1, using the same normalization formula as in step S1.
Step S6, extracting the features of the test sample: perform first-order difference processing on the test sample to obtain its features; then use the classifier to judge whether the test sample belongs to the class whose data flow values show a sharp or rising trend or to the class whose data flow values show a gentle or falling trend. In this embodiment, when the data stream is a traffic data stream, the classifier judges whether the test sample shows a sharp or a gentle data flow value trend; when the data stream is a network load data stream, the classifier judges whether the test sample shows a rising or a falling data flow value trend.
The classifier is a K-nearest-neighbour classifier, trained with the class-separated training samples from step S3 as input and the class each training sample belongs to as output. Training a K-nearest-neighbour classifier is lazy: only the training samples need to be stored, and the K training samples closest to a test sample are computed when the test sample arrives; K = 10 in this embodiment. The classifier adopts a weighted voting strategy, i.e. the reciprocal of the distance serves as the voting weight, so the closer a neighbour is, the greater its influence on the final class decision.
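The inverse-distance weighted vote described above can be sketched as follows (a minimal illustration, not the patent's implementation; function and variable names are assumptions):

```python
import numpy as np

def knn_predict(x, train_feats, train_labels, k=10, eps=1e-12):
    """Weighted K-nearest-neighbour vote: each of the k closest stored
    samples votes for its class with weight 1/distance (K = 10 in the
    embodiment); eps guards against division by zero."""
    d = np.linalg.norm(train_feats - x, axis=1)  # distance to every stored sample
    nearest = np.argsort(d)[:k]                  # indices of the k closest samples
    votes = {}
    for i in nearest:
        votes[train_labels[i]] = votes.get(train_labels[i], 0.0) + 1.0 / (d[i] + eps)
    return max(votes, key=votes.get)             # class with the largest weighted vote
```

With this weighting, two very close neighbours can outvote many distant ones, which matches the statement that closer samples have greater influence.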
In the above step S6, if the data stream is a traffic data stream, the features of the test sample are extracted as follows: perform first-order difference processing on the test sample and take the absolute value of each first-order difference as the feature of the test sample; wherein
when the obtained test sample is [x_t′, x_t′−1, ..., x_t′−N+1], first-order difference processing gives:
[x_t′ − x_t′−1, x_t′−1 − x_t′−2, ..., x_t′−N+2 − x_t′−N+1];
where x_t′ is the data flow value counted at the current time point t′, x_t′−1, ..., x_t′−N+1 respectively correspond to the data flow values counted at time points t′−1, ..., t′−N+1, and N is the length of the sliding window;
the features of the test sample obtained in step S6 are then:
[|x_t′ − x_t′−1|, |x_t′−1 − x_t′−2|, ..., |x_t′−N+2 − x_t′−N+1|];
in step S6, if the data flow is a network load data flow, the features of the test sample are extracted as follows: perform first-order difference processing on the test sample and take the first-order differences themselves as the feature of the test sample; wherein
when the obtained test sample is [x_t′, x_t′−1, ..., x_t′−N+1], first-order difference processing gives the features of the test sample:
[x_t′ − x_t′−1, x_t′−1 − x_t′−2, ..., x_t′−N+2 − x_t′−N+1].
Step S7, when the test sample belongs to the class whose data flow values show a sharp or rising trend, it is input into the first sub-model obtained in step S4, and the first sub-model predicts the data flow value counted at the next time point of the observation point;
when the test sample belongs to the class whose data flow values show a gentle or falling trend, it is input into the second sub-model obtained in step S4, and the second sub-model predicts the data flow value counted at the next time point of the observation point.
Example 1
The method of the embodiment is applied to the prediction of traffic data flow, and specifically comprises the following steps:
In this embodiment, traffic data flow values from a certain traffic observation point are taken over the 52 weeks from January 1, 2015 to December 30, 2015, the values being collected by sensors every 30 seconds. After weekends and holidays are removed, traffic data flow values for 247 days remain. In the experiment, the training sample set is built from the traffic data flow values counted at each time point in the first 200 days, and the test sample set from those counted at each time point in the last 47 days.
As shown in fig. 2, when T = 5 in this embodiment, i.e. when the time interval between every two adjacent time points is 5 minutes, K-means clustering of the training samples collected on a randomly chosen day between January 1 and December 30, 2015 separates two classes of training samples, one with a sharp and one with a gentle data flow value trend. As shown in fig. 3, with the same T = 5, K-means clustering of the training samples collected from the data flow of the whole year from January 1 to December 30, 2015 likewise separates the two classes of training samples.
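The clustering used here can be sketched with a minimal two-cluster K-means over the per-sample difference features (a sketch only; `sklearn.cluster.KMeans` would serve equally, and all names are illustrative):

```python
import numpy as np

def kmeans_two_classes(feats, iters=50, seed=0):
    """Minimal two-cluster K-means: pick two samples as initial
    centres, then alternate nearest-centre assignment and centre
    re-estimation. Returns a 0/1 label per training sample."""
    feats = np.asarray(feats, dtype=float)
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), size=2, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(feats[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)            # nearest centre per sample
        for j in range(2):
            if np.any(labels == j):              # avoid emptying a cluster
                centers[j] = feats[labels == j].mean(axis=0)
    return labels
```

On well-separated difference features (e.g. large vs small absolute variations), the two returned labels correspond to the sharp and gentle classes.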
For the obtained training and test sample sets, when T = 5, fig. 4 compares the data flow value predicted at each time point by the method of this embodiment with the actually counted data flow value at each time point; as can be seen from fig. 4, the predictions closely fit the actually observed data flow values, showing that the data flow value prediction accuracy of the method of this embodiment is very high.
Fig. 5 compares, for sliding windows of different lengths, the mean absolute percentage error (MAPE) of the data flow predicted by the method of this embodiment (labelled Multi-Model LSTM in fig. 5) with that of four models commonly used in short-time data flow prediction: the historical average model (HA), K-nearest neighbours (KNN), support vector regression (SVR) and a single long short-term memory model (single LSTM). As can be seen from fig. 5, for every window length the method of this embodiment yields a lower MAPE than the other four models, in particular than the single LSTM model.
When T takes different values (T = 10, 15, 20 and 25), the prediction accuracy of the method of this embodiment and of the other four models is shown in table 1:
TABLE 1
(Table 1 is reproduced as an image in the original patent document.)
MAPE (mean absolute percentage error) and RMSE (root mean square error) are the two evaluation criteria for prediction accuracy, defined respectively as:
MAPE = (100% / M) × Σ_{i=1..M} |x_{i,real} − x_{i,pre}| / x_{i,real}
RMSE = √( (1 / M) × Σ_{i=1..M} (x_{i,real} − x_{i,pre})² )
where M is the total number of test samples, x_{i,real} is the true label corresponding to the i-th test sample, and x_{i,pre} is the data flow value predicted for the i-th test sample by the method of this embodiment.
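The two error measures above can be computed directly from their definitions:

```python
import math

def mape(real, pred):
    """Mean absolute percentage error over the M test samples, in %."""
    m = len(real)
    return 100.0 * sum(abs(r - p) / r for r, p in zip(real, pred)) / m

def rmse(real, pred):
    """Root mean square error over the M test samples."""
    m = len(real)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, pred)) / m)
```

Note that MAPE is undefined when a true value is zero, so it suits strictly positive flow counts like the ones used here.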
As can be seen from table 1, for every value of T both prediction error indicators of the method of this embodiment are lower than those of the other 4 methods.
Example 2
The method of the embodiment is applied to the prediction of the network load data traffic, and specifically comprises the following steps:
In this embodiment, Wikipedia network load data logs from 2014 to 2016 are obtained, in which the number of visitors to the platform is recorded once per hour. Data from the network observation point covering June 1, 2014 to December 30, 2015, 45510 hours in total, are collected; in the experiment the first 36408 hours are taken as the training sample set and the last 9102 hours as the test sample set.
As shown in fig. 6, when T = 60 in this embodiment, i.e. when the time interval between every two adjacent time points is 60 minutes, K-means clustering of the training samples collected on 10 randomly chosen days between June 1, 2014 and December 30, 2015 separates two classes of training samples, one with a rising and one with a falling data flow value trend. As shown in fig. 7, with the same T = 60, K-means clustering of the training samples collected on 100 randomly chosen days in the same period likewise separates the two classes of training samples.
For the obtained training and test sample sets, when T = 60, fig. 8 compares the data flow value predicted at each time point by the method of this embodiment with the actually counted data flow value at each time point; as can be seen from fig. 8, the predictions closely fit the actual data flow values, showing that the data flow prediction accuracy of the method of this embodiment is very high.
When T = 60, the prediction accuracy of the method of this embodiment and of the other four models is shown in table 2, where the other four models are the historical average model (HA), K-nearest neighbours (KNN), support vector regression (SVR) and the single long short-term memory model (single LSTM) commonly used in short-time data flow prediction, and Multi-Model LSTM denotes the mean absolute percentage error (MAPE) and root mean square error of the data flow predicted by the method of this embodiment.
TABLE 2
(Table 2 is reproduced as images in the original patent document.)
As can be seen from table 2, the prediction error indicators of the method of this embodiment are lower than those of the other 4 models.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to them; any change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and is included within the scope of the present invention.

Claims (10)

1. A short-time data flow prediction method based on a long-time and short-time memory network model is characterized by being applied to the prediction of traffic data flow or the prediction of network load data flow, and comprising the following steps of:
step S1, for an observation point whose data stream requires short-time prediction, firstly collecting the data flow values counted at a plurality of historical time points at the observation point, then splicing and aggregating the collected data flow values counted at each historical time point into a one-dimensional array in time order and performing normalization processing; wherein the time interval between every two adjacent time points is the same, namely T minutes; the data flow value counted at each time point refers to the data flow generated in the time interval from the previous time point to that time point;
step S2, performing sliding-window processing on the normalized one-dimensional array obtained in step S1 to obtain a plurality of training samples, and taking the data flow value counted at the time point immediately following the last time point in each training sample as the label of that training sample; each training sample comprises the data flow values counted at a plurality of time points;
step S3, extracting the features of the training samples: performing first-order difference processing on the training samples to obtain their features; clustering the training samples according to these features, and separating two classes of training samples whose data flow value trends are sharp and gentle, or two classes whose data flow value trends are rising and falling;
step S4, obtaining an LSTM model after model parameter initialization, and then training the LSTM model, specifically: firstly, training the LSTM model with each of the training samples as input and the label corresponding to each training sample as output to obtain a trained main model; then inputting the class of training samples with a sharp or rising data flow value trend obtained in step S3 into the trained main model for training to obtain a first sub-model; meanwhile, inputting the class of training samples with a gentle or falling data flow value trend obtained in step S3 into the trained main model for training to obtain a second sub-model;
step S5, when the data flow value counted at the next time point of the observation point is to be predicted at the current time point, firstly obtaining, through a sliding window, a sample composed of the data flow values counted at the current time point of the observation point and at a plurality of time points before it, and using this sample as a test sample; in the test sample, the time interval between every two adjacent time points is the same, namely T1 minutes, where T1 = T; the time interval between the current time point and the next time point whose data flow value is to be predicted is also T minutes;
step S6, extracting the features of the test sample: performing first-order difference processing on the test sample to obtain the features of the test sample; then judging, by a classifier and according to the features of the test sample, whether the test sample belongs to the class with a sharp or rising data flow value trend or to the class with a gentle or falling data flow value trend; the classifier being trained with the class-separated training samples from step S3 as input and the class each training sample belongs to as output;
step S7, when the test sample belongs to the class with a sharp or rising data flow value trend, inputting the test sample into the first sub-model obtained in step S4 and predicting, by the first sub-model, the data flow value counted at the next time point of the observation point;
and when the test sample belongs to the class with a gentle or falling data flow value trend, inputting the test sample into the second sub-model obtained in step S4 and predicting, by the second sub-model, the data flow value counted at the next time point of the observation point.
2. The method for predicting short-term data stream based on long-term and short-term memory network model according to claim 1, wherein in step S1, the normalization process is performed on the one-dimensional array obtained after the concatenation and aggregation by:
x_f′ = (x_f − x_min) / (x_max − x_min)
where x_f is the f-th dimension of the one-dimensional array, and x_min and x_max are respectively the minimum and maximum values in the spliced and aggregated one-dimensional array.
3. The short-term data flow prediction method based on the long-term memory network model as claimed in claim 1, wherein the length of the sliding window is N, and when the data flow is a traffic data flow, the product of the length of the sliding window and the time interval T of every two adjacent time points satisfies the following relationship: NxT is less than or equal to 60.
4. The method for predicting short-term data flow based on short-term memory network model as claimed in claim 3, wherein in step S1, when the data flow is a traffic data flow, T ≤ 30;
and when the observation point has historically counted the data flow every E minutes, T = E, 2E, …, (n−1)E or nE, where n and E are both fixed values and nE ≤ 30.
5. The short-term data flow prediction method based on a long-term and short-term memory network model as claimed in claim 4, wherein when T is Y minutes, the length N of the sliding window is an integer from 2 to 60/Y.
6. The method for predicting short-term data flow based on long-term and short-term memory network model of claim 1, wherein in step S3, if the data flow is traffic data flow, the feature extraction process of the training samples is as follows: performing first-order difference processing on the training samples, and taking an absolute value of a result of each first-order difference as a characteristic of the training samples; wherein
when the training sample is [x_t, x_t−1, ..., x_t−N+1], first-order difference processing gives:
[x_t − x_t−1, x_t−1 − x_t−2, ..., x_t−N+2 − x_t−N+1];
wherein x_t, x_t−1, ..., x_t−N+1 respectively correspond to the data flow values counted at time points t, t−1, …, t−N+1, and N is the length of the sliding window;
the features of the training sample obtained in step S3 are:
[|x_t − x_t−1|, |x_t−1 − x_t−2|, ..., |x_t−N+2 − x_t−N+1|];
in step S3, if the data flow is a network load data flow, the feature extraction process of the training sample is as follows: performing first-order difference processing on the training sample, and taking the first-order differences of the training sample as its features; wherein
when the training sample is [x_t, x_t−1, ..., x_t−N+1], first-order difference processing gives the features of the training sample:
[x_t − x_t−1, x_t−1 − x_t−2, ..., x_t−N+2 − x_t−N+1].
7. The method for predicting short-term data flow based on a long-term and short-term memory network model according to claim 1, wherein in step S3, according to the features of the training samples, all the training samples are separated by K-means clustering into two classes of training samples with sharp and gentle data flow value trends, or into two classes with rising and falling data flow value trends.
8. The method for predicting a short-term data stream based on a long-term and short-term memory network model as claimed in claim 1, wherein in step S4, when the LSTM model is initialized, a matrix of the specified dimensions is first generated, then singular value decomposition is performed to produce the three matrices U, Σ and V; the matrix U is used as the initial value of the weight matrices of the input gate, forget gate, output gate and candidate state values in the LSTM model hidden layer, and all bias vectors in the LSTM model are set to 0.
9. The short-term data flow prediction method based on the long-term and short-term memory network model according to claim 1, wherein in step S6, the classifier is a K-nearest neighbor classifier.
10. The method for predicting short-term data flow based on long-term and short-term memory network model of claim 1, wherein in step S6, if the data flow is a traffic data flow, the feature extraction process of the test sample is as follows: performing first-order difference processing on the test sample, and taking the absolute value of each first-order difference as the feature of the test sample; wherein
when the obtained test sample is [x_t′, x_t′−1, ..., x_t′−N+1], first-order difference processing gives:
[x_t′ − x_t′−1, x_t′−1 − x_t′−2, ..., x_t′−N+2 − x_t′−N+1];
wherein x_t′ is the data flow value counted at the current time point t′, x_t′−1, ..., x_t′−N+1 respectively correspond to the data flow values counted at time points t′−1, …, t′−N+1, and N is the length of the sliding window;
the features of the test sample obtained in step S6 are:
[|x_t′ − x_t′−1|, |x_t′−1 − x_t′−2|, ..., |x_t′−N+2 − x_t′−N+1|];
in step S6, if the data flow is a network load data flow, the feature extraction process of the test sample is as follows: performing first-order difference processing on the test sample, and taking the first-order differences of the test sample as its features; wherein
when the obtained test sample is [x_t′, x_t′−1, ..., x_t′−N+1], first-order difference processing gives the features of the test sample:
[x_t′ − x_t′−1, x_t′−1 − x_t′−2, ..., x_t′−N+2 − x_t′−N+1].
CN201711264618.9A 2017-12-05 2017-12-05 Short-time data flow prediction method based on long-time and short-time memory network model Active CN108062561B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711264618.9A CN108062561B (en) 2017-12-05 2017-12-05 Short-time data flow prediction method based on long-time and short-time memory network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711264618.9A CN108062561B (en) 2017-12-05 2017-12-05 Short-time data flow prediction method based on long-time and short-time memory network model

Publications (2)

Publication Number Publication Date
CN108062561A CN108062561A (en) 2018-05-22
CN108062561B true CN108062561B (en) 2020-01-14

Family

ID=62136080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711264618.9A Active CN108062561B (en) 2017-12-05 2017-12-05 Short-time data flow prediction method based on long-time and short-time memory network model

Country Status (1)

Country Link
CN (1) CN108062561B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033140B (en) * 2018-06-08 2020-05-29 北京百度网讯科技有限公司 Method, device, equipment and computer storage medium for determining search result
CN108900346B (en) * 2018-07-06 2021-04-06 西安电子科技大学 Wireless network flow prediction method based on LSTM network
CN109194498B (en) * 2018-07-27 2021-10-08 南京理工大学 Network traffic prediction method based on LSTM
CN109143105A (en) * 2018-09-05 2019-01-04 上海海事大学 A kind of state-of-charge calculation method of lithium ion battery of electric automobile
CN109120463B (en) * 2018-10-15 2022-01-07 新华三大数据技术有限公司 Flow prediction method and device
JP2021508096A (en) 2018-11-02 2021-02-25 アドバンスド ニュー テクノロジーズ カンパニー リミテッド Monitoring multiple system indicators
CN109462520B (en) * 2018-11-19 2021-12-10 电子科技大学 Network traffic resource situation prediction method based on LSTM model
CN109815785A (en) * 2018-12-05 2019-05-28 四川大学 A kind of face Emotion identification method based on double-current convolutional neural networks
CN110231976B (en) * 2019-05-20 2021-04-20 西安交通大学 Load prediction-based edge computing platform container deployment method and system
CN110390386B (en) * 2019-06-28 2022-07-29 南京信息工程大学 Sensitive long-short term memory method based on input change differential
CN110474808B (en) * 2019-08-20 2022-02-18 中国联合网络通信集团有限公司 Flow prediction method and device
CN110516041A (en) * 2019-08-28 2019-11-29 深圳勇艺达机器人有限公司 A kind of file classification method of interactive system
CN110855474B (en) * 2019-10-21 2022-06-17 广州杰赛科技股份有限公司 Network feature extraction method, device, equipment and storage medium of KQI data
CN111583628B (en) * 2020-03-27 2021-05-11 北京交通大学 Road network heavy truck traffic flow prediction method based on data quality control
CN111508230B (en) * 2020-04-16 2021-08-20 中国科学院自动化研究所 Time-interval traffic flow trend prediction method, system and device based on deep learning
CN111709549B (en) * 2020-04-30 2022-10-21 东华大学 SVD-PSO-LSTM-based short-term traffic flow prediction navigation reminding method
US11870863B2 (en) 2020-05-25 2024-01-09 Nec Corporation Method for operating a network
CN111815046B (en) * 2020-07-06 2024-03-22 北京交通大学 Traffic flow prediction method based on deep learning
CN112182954B (en) * 2020-09-08 2023-05-23 上海大学 LSTM-based fluid simulation data prediction model
CN112580260A (en) * 2020-12-22 2021-03-30 广州杰赛科技股份有限公司 Method and device for predicting water flow of pipe network and computer readable storage medium
CN112905958B (en) * 2021-01-27 2024-04-19 南京国电南自电网自动化有限公司 Short-time data window telemetry data state identification method and system based on measurement and control device
CN113944888B (en) * 2021-11-03 2023-12-08 北京软通智慧科技有限公司 Gas pipeline leakage detection method, device, equipment and storage medium
CN115410386B (en) * 2022-09-05 2024-02-06 同盾科技有限公司 Short-time speed prediction method and device, computer storage medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105389980A (en) * 2015-11-09 2016-03-09 上海交通大学 Short-time traffic flow prediction method based on long-time and short-time memory recurrent neural network
WO2016156236A1 (en) * 2015-03-31 2016-10-06 Sony Corporation Method and electronic device
KR101742042B1 (en) * 2016-11-15 2017-05-31 한국과학기술정보연구원 Apparatus and method for traffic flow prediction
CN106960252A (en) * 2017-03-08 2017-07-18 深圳市景程信息科技有限公司 Methods of electric load forecasting based on long Memory Neural Networks in short-term
WO2017150032A1 (en) * 2016-03-02 2017-09-08 Mitsubishi Electric Corporation Method and system for detecting actions of object in scene

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016156236A1 (en) * 2015-03-31 2016-10-06 Sony Corporation Method and electronic device
CN105389980A (en) * 2015-11-09 2016-03-09 上海交通大学 Short-time traffic flow prediction method based on long-time and short-time memory recurrent neural network
WO2017150032A1 (en) * 2016-03-02 2017-09-08 Mitsubishi Electric Corporation Method and system for detecting actions of object in scene
KR101742042B1 (en) * 2016-11-15 2017-05-31 한국과학기술정보연구원 Apparatus and method for traffic flow prediction
CN106960252A (en) * 2017-03-08 2017-07-18 深圳市景程信息科技有限公司 Methods of electric load forecasting based on long Memory Neural Networks in short-term

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Chen Liang et al., "Application of LSTM Networks in Short-Term Power Load Forecasting under a Deep Learning Framework", Electric Power Information and Communication Technology, vol. 15, no. 5, pp. 8-11, 2017-07-30; *
Han-Kai Hsu et al., "Learning to Tell Brake and Turn Signals in Videos Using CNN-LSTM Structure", 2017 IEEE 20th International Conference on Intelligent Transportation Systems, 2017. *

Also Published As

Publication number Publication date
CN108062561A (en) 2018-05-22

Similar Documents

Publication Publication Date Title
CN108062561B (en) Short-time data flow prediction method based on long-time and short-time memory network model
CN111314331B (en) Unknown network attack detection method based on conditional variation self-encoder
CN113515770A (en) Method and device for determining target business model based on privacy protection
CN106709588B (en) Prediction model construction method and device and real-time prediction method and device
CN108229724B (en) Short-term traffic data flow prediction method based on temporal-spatial information fusion
CN114006826B (en) Network traffic prediction method fusing traffic characteristics
JP6965206B2 (en) Clustering device, clustering method and program
US10902311B2 (en) Regularization of neural networks
CN109657600B (en) Video area removal tampering detection method and device
CN110633859A (en) Hydrological sequence prediction method for two-stage decomposition integration
JP2011059500A (en) Speaker clustering device and speaker clustering method
CN115801463B (en) Industrial Internet platform intrusion detection method and device and electronic equipment
CN115580445A (en) Unknown attack intrusion detection method, device and computer readable storage medium
CN108154186B (en) Pattern recognition method and device
CN113449905A (en) Traffic jam early warning method based on gated cyclic unit neural network
CN117041017A (en) Intelligent operation and maintenance management method and system for data center
US20210397956A1 (en) Activity level measurement using deep learning and machine learning
US11580362B2 (en) Learning apparatus, generation apparatus, classification apparatus, learning method, and non-transitory computer readable storage medium
CN113962160A (en) Internet card user loss prediction method and system based on user portrait
CN111160419B (en) Deep learning-based electronic transformer data classification prediction method and device
CN114584230B (en) Predictive channel modeling method based on countermeasure network and long-term and short-term memory network
US20220269988A1 (en) Abnormality degree calculation system and abnormality degree calculation method
CN113177078B (en) Approximate query processing algorithm based on condition generation model
CN114328921A (en) Small sample entity relation extraction method based on distribution calibration
Petrlik et al. Multiobjective selection of input sensors for svr applied to road traffic prediction

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant