CN110188919A

CN110188919A - A load forecasting method based on a long short-term memory network

Info

Publication number: CN110188919A
Application number: CN201910325295.2A
Authority: CN
Inventors: 许贤泽; 施元
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2019-04-22
Filing date: 2019-04-22
Publication date: 2019-08-30

Abstract

The invention belongs to the field of electric load forecasting and discloses a load forecasting method based on a long short-term memory (LSTM) network, comprising: collecting the electricity load data of a target area over a given time period, together with the corresponding weather feature data, to form a raw data set; handling missing values in the raw data set using a Spark cluster; performing feature selection on the raw data set; performing feature compression on the raw data set; building a prediction model based on a long short-term memory network; training the prediction model in a distributed manner using the Spark cluster; and, from the weather feature data and electricity load data of the previous time point, performing distributed prediction with the prediction model to obtain the predicted load at the current time point. The invention addresses the difficulty in the prior art of forecasting electric load quickly and efficiently in big-data scenarios, and enables fast and effective extraction, processing, and computation over large data sets.

Description

A load forecasting method based on a long short-term memory network

Technical field

The present invention relates to the field of electric load forecasting, and in particular to a load forecasting method based on a long short-term memory network.

Background technique

Load forecasting is the problem of predicting the electric load that a power utility will need to supply at a specific future time, and it is one of the core tasks in power grid planning. A utility analyzes historical load data and judges future development trends in order to forecast how the electric load will change and develop over a coming period. An accurate load forecast is vital to both the utility's short-term dispatch scheduling and its long-term system planning, and it is the basis for drawing up power supply plans, development plans, financial plans, and so on.

By the end of 2017, grid-connected new-energy capacity within the State Grid dispatching area had reached 280 million kilowatts, including 145.39 million kilowatts of wind power and 120.83 million kilowatts of solar power, ranking first in the world; more than 400 million smart meters had been installed in total, essentially achieving full coverage of automatic electricity-usage data collection. As data volumes keep growing, load forecasting has entered the big-data era. Integrating load forecasting with big-data technology has become a necessity and is of strategic importance for the development of the power industry.

To cope with the challenges posed by massive data, "big data" has given rise to a large number of distributed parallel computing and storage technologies. Building on the MapReduce programming framework and the Google File System developed by Google, Apache's Hadoop project opened the era of enterprise-grade big-data processing, and a variety of distributed computing and storage platforms developed around Hadoop followed. As the demand for real-time computing keeps rising, stream-processing engines such as Spark Streaming and Flink have set off a new wave of big-data development.

To forecast electric load quickly and efficiently in big-data scenarios, a complete load-forecasting data processing and modeling scheme suited to big-data processing is needed.

Summary of the invention

By providing a load forecasting method based on a long short-term memory network, the embodiments of the present application solve the prior-art problem that electric load forecasting is difficult to perform quickly and efficiently in big-data scenarios.

The embodiments of the present application provide a load forecasting method based on a long short-term memory network, comprising the following steps:

Step S1, collecting the electricity load data of a target area over a given time period, together with the corresponding weather feature data, to form a raw data set;

Step S2, handling missing values in the raw data set using a Spark cluster;

Step S3, performing feature selection on the raw data set;

Step S4, performing feature compression on the raw data set;

Step S5, building a prediction model based on a long short-term memory network;

Step S6, training the prediction model in a distributed manner using the Spark cluster;

Step S7, performing distributed prediction with the prediction model, based on the weather feature data and electricity load data of the previous time point, to obtain the predicted load at the current time point.

Preferably, the load forecasting method based on a long short-term memory network further comprises:

Step S8: storing the electricity load data and weather feature data collected in real time in an HBase cluster, and displaying the predicted load together with the load value collected in real time.

Preferably, in step S1, the raw data set formed is stored in an HBase cluster.

Preferably, in step S2, the Spark cluster is used to perform K-means clustering, after which missing values are filled by within-cluster mean imputation.

Preferably, in step S3, the Pearson correlation coefficient between the electricity load data and the weather feature data is calculated, and features are ranked by importance by training a gradient boosting decision tree.

Preferably, feature variables whose significance value is higher than 0.05 are rejected, all variables are ranked by correlation, and the top 30% of features are taken as a first feature set; a gradient boosting decision tree is trained to obtain a feature importance ranking, and the top 30% of features are taken as a second feature set; the intersection of the first feature set and the second feature set is taken as the screened features.

Preferably, in step S4, feature compression is performed on the weather feature data using principal component analysis.

Preferably, in step S5, parameters are updated using truncated backpropagation through time.

Preferably, in step S6, data-parallel distributed training is carried out on the Spark cluster using asynchronous stochastic gradient descent.

Preferably, in step S7, the electricity load data and weather feature data of the previous time point are obtained, the degree of parallelism is set according to the data volume and the cluster hardware information, and the load is predicted using the Spark cluster.

One or more of the technical solutions provided in the embodiments of the present application have at least the following technical effects or advantages:

In the embodiments of the present application, historical electricity load and weather feature data sets are obtained and subjected to data preprocessing, feature selection, and feature compression. A long short-term memory network using truncated backpropagation through time is then trained in a distributed manner with asynchronous stochastic gradient descent to build a prediction model of electricity load, which predicts the electricity consumption at a given moment. The invention can extract, process, and compute over large data sets quickly and effectively.

Brief description of the drawings

To illustrate the technical solution of this embodiment more clearly, the drawings needed in the description of the embodiment are briefly introduced below. The drawings described below show one embodiment of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 shows the data processing platform of a load forecasting method based on a long short-term memory network provided by an embodiment of the present invention;

Fig. 2 is a topology diagram of the long short-term memory network model in a load forecasting method based on a long short-term memory network provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of truncated backpropagation through time in a load forecasting method based on a long short-term memory network provided by an embodiment of the present invention;

Fig. 4 is a topology diagram of Spark plain-mode training in a load forecasting method based on a long short-term memory network provided by an embodiment of the present invention;

Fig. 5 is a topology diagram of Spark mesh-mode training in a load forecasting method based on a long short-term memory network provided by an embodiment of the present invention.

Detailed description of the embodiments

For a better understanding of the above technical solution, it is described in detail below with reference to the drawings and specific embodiments.

The load forecasting method based on a long short-term memory network provided in this embodiment mainly comprises the following steps:

Step S1: collect the historical electricity load data of a region, together with the associated weather data for the same region and time period, to form a raw data set, and store it in an HBase cluster.

The historical data (i.e. the raw data set) is stored in HBase. Because HBase scales out well horizontally, it can meet the demands that large data volumes place on persisting and reading data, and Hive can be used for interactive queries.

Step S2: handle missing values in the data set using the Spark cluster.

By supplying the relevant information to the driver node of the Spark cluster, the compute nodes of the cluster load the data in parallel into the cluster's distributed memory and persist it. Within the Spark cluster the data is abstracted as resilient distributed datasets (RDDs) and computed in memory, which reduces the time cost of data preprocessing. Missing values are filled in parallel by within-cluster mean imputation based on K-means clustering, which yields values closer to the true ones than ordinary mean imputation.

Step S3: perform feature selection on the data.

The Pearson correlation coefficient between each weather feature and the electricity load data is calculated, and a gradient boosting decision tree is trained to obtain a feature importance ranking. The two are combined for feature screening, which preserves the characteristics of the original data as far as possible while reducing the amount of training data.

Step S4: perform feature compression on the data.

The data is compressed using principal component analysis, which retains as much of the original data's characteristics as possible while reducing the amount of input data to the model and increasing its computation speed.

Step S5: build a model based on a long short-term memory network.

A long short-term memory network is used to model load forecasting on the compressed data, which guards against the "vanishing gradient" problem. Parameters are updated using truncated backpropagation through time, which reduces the complexity of each parameter update in the network and raises the update frequency, so that the network can be trained faster with the same computing power.

Step S6: train the model in a distributed manner using the Spark cluster.

Based on the Spark cluster, a data-parallel training method with asynchronous stochastic gradient descent is adopted. Training the network in Spark cluster mode significantly reduces the volume of parameter-update data exchanged between nodes, thereby increasing training speed.

Step S7: from the electricity load data and weather feature data of the previous time point, perform distributed prediction with the long short-term memory network model that has been built, to obtain the predicted load at the current time point.

The electricity load data and weather feature data of the previous moment are read, a reasonable degree of parallelism is set, and the load is predicted using the Spark cluster, which increases prediction speed.

Step S8: store the electricity load data and weather data collected in real time in the HBase cluster, and display the predicted load and the load value collected in real time (i.e. the true load value) through a graphical interface on the Web side.

The latest electricity load data and weather feature data collected in real time are stored in the HBase cluster. The real-time true load value (i.e. the load value collected in real time) is displayed alongside the predicted load, so that errors and trends can be observed.

The present invention is described in further detail below.

Fig. 1 is a structural block diagram of the data processing platform provided by the invention, which comprises: HDFS, the distributed file system in Hadoop, which provides the underlying storage; the MapReduce computing framework, which provides data operation interfaces for HBase and Hive; the cluster computing framework Spark; and a message queue (MQ) that receives real-time data. HBase is the distributed database, and Hive provides SQL-style data operations for the relevant staff.

Taking real-world conditions into account, the invention considers factors such as weather, time, and regional correlation and, through mathematical modeling, predicts the electricity load of a given region at a given future moment while guaranteeing a certain fault tolerance and accuracy.

The load forecasting method based on a long short-term memory network provided in this embodiment specifically comprises:

Step S1: collect the historical electricity load data of a region, together with the associated weather data for the same region and time period, and store them in an HBase cluster.

The original electricity data set and weather data set are each stored in HBase in advance. HBase can store massive data sets; as a column-oriented NoSQL database, its columns can be added dynamically as needed, which suits load feature data with many dimensions.

Step S2: handle missing values in the data set using the Spark cluster.

K-means clustering is applied to the meteorological feature data to obtain several distinct clusters, and missing values are filled with the mean of the same cluster. Compared with ordinary mean imputation, the imputed values are closer to the true values, which improves the accuracy of the model.

K-means is an iterative algorithm. Suppose the data is to be clustered into K groups. First, K random points are selected as cluster centers. Each data point in the data set is then associated with its nearest center, where distance is measured by the 2-norm of the feature vectors, and all points associated with the same center form one cluster. The mean of each group is computed, and the center associated with that group is moved to the mean. K-means minimizes the sum of the distances between all data points and their associated cluster centers; its cost function is:

$$J\left(c^{(1)},\dots,c^{(m)},\mu_1,\dots,\mu_K\right)=\frac{1}{m}\sum_{i=1}^{m}\left\|x^{(i)}-\mu_{c^{(i)}}\right\|^{2}$$

where $\mu_{c^{(i)}}$ denotes the cluster center nearest to the feature vector $x^{(i)}$. The optimization objective is to find the $c^{(1)},\dots,c^{(m)}$ and $\mu_1,\dots,\mu_K$ that minimize the cost function.

The algorithm proceeds as follows:

(1) Create K points at random as the initial centers;

(2) For each feature vector in the meteorological feature data set, compute its distance to each center;

(3) Assign the feature vector to its nearest center;

(4) For the newly obtained groups, compute the center (mean vector) of each group;

Repeat steps (2) to (4) until the algorithm converges.
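The four steps above can be sketched in plain Python as follows. This is a minimal single-machine illustration, not the Spark implementation the method relies on; the function names `squared_distance` and `kmeans`, the choice of sampling the initial centers from the data points, and the fixed random seed are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean (2-norm) distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Lloyd's algorithm. Returns (centers, labels, cost)."""
    rng = random.Random(seed)
    # (1) Pick K data points at random as the initial centers (assumption).
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # (2)-(3) Assign every vector to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
        # (4) Move each center to the mean of its group.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty group keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    # Mean squared distance of each vector to its assigned center.
    cost = sum(
        squared_distance(p, centers[lab]) for p, lab in zip(points, labels)
    ) / len(points)
    return centers, labels, cost
```

The returned cost is the mean squared distance between each vector and its assigned center, which is the quantity the cost function described above minimizes.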

K-means cannot determine the number of clusters by itself. A larger number of clusters makes the cost function smaller but leads to overfitting, so choosing a suitable number of clusters is important for the accuracy of mean imputation. The error of the meteorological feature data is computed for different numbers of clusters, and the number of clusters at which the error falls fastest is chosen as the basis for classification. Mean imputation is then carried out separately within each weather feature cluster, completing the missing-value handling.
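As a rough illustration of the two operations in this paragraph, the sketch below picks a cluster count from a table of clustering errors and fills missing values with the mean of the same cluster. The names `pick_cluster_count` and `impute_by_cluster_mean` are hypothetical, missing values are assumed to be represented as `None`, and the cluster labels are assumed to come from a K-means run on the columns that are observed; the source does not specify these details.

```python
def pick_cluster_count(cost_by_k):
    """Return the k at which the error dropped most relative to the previous k."""
    ks = sorted(cost_by_k)
    drops = {k: cost_by_k[prev] - cost_by_k[k] for prev, k in zip(ks, ks[1:])}
    return max(drops, key=drops.get)


def impute_by_cluster_mean(rows, labels):
    """Replace None entries with the column mean of the row's own cluster.

    Assumes every (cluster, column) pair has at least one observed value.
    """
    sums, counts = {}, {}
    for row, lab in zip(rows, labels):
        for col, value in enumerate(row):
            if value is not None:
                sums[(lab, col)] = sums.get((lab, col), 0.0) + value
                counts[(lab, col)] = counts.get((lab, col), 0) + 1
    filled = []
    for row, lab in zip(rows, labels):
        filled.append([
            value if value is not None
            else sums[(lab, col)] / counts[(lab, col)]
            for col, value in enumerate(row)
        ])
    return filled


# Example: two weather columns (temperature, humidity), two clusters.
rows = [[20.0, 60.0], [22.0, None], [5.0, 90.0], [None, 94.0]]
labels = [0, 0, 1, 1]
filled = impute_by_cluster_mean(rows, labels)
```

In the example, the missing humidity in the second row is filled from cluster 0 only, and the missing temperature in the last row from cluster 1 only, rather than from the global column means.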

Step S3: perform feature selection on the data.

Pearson correlation coefficients are calculated and a model is trained with the gradient boosting decision tree algorithm; combining the two for feature selection gives the selection a higher fault tolerance.

The Pearson correlation coefficient and its significance are calculated. The Pearson correlation coefficient between the load and each weather feature is computed as:

$$r=\frac{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)\left(Y_i-\bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_i-\bar{X}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(Y_i-\bar{Y}\right)^{2}}}$$

where $\bar{X}$ and $\bar{Y}$ are the means of the $X_i$ samples and the $Y_i$ samples, respectively. Feature variables whose significance value is higher than 0.05 are rejected, all variables are ranked by correlation, and the top 30% of features are taken as feature set A.

A gradient boosting decision tree is trained to obtain a feature importance ranking, and the top 30% of features are taken as feature set B. The intersection of feature set A and feature set B is taken as the screened features.
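The screening rule can be sketched as follows. The Pearson coefficient is computed directly; the significance values and the gradient-boosting importances are taken as given inputs, since computing them is outside the scope of this sketch. The function names, ranking by absolute correlation, and rounding the 30% cut-off up to at least one feature are assumptions, not details stated in the source.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def top_fraction(scores, fraction=0.3):
    """Names of the highest-scoring `fraction` of features (at least one)."""
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


def select_features(correlations, p_values, importances,
                    alpha=0.05, fraction=0.3):
    """Intersect the correlation-based set A with the importance-based set B."""
    # Reject features whose significance value exceeds alpha,
    # then rank the rest by absolute correlation (assumption).
    significant = {
        name: abs(r) for name, r in correlations.items()
        if p_values[name] <= alpha
    }
    set_a = top_fraction(significant, fraction)
    set_b = top_fraction(importances, fraction)
    return set_a & set_b
```

Taking the intersection means a feature survives only if both the linear-correlation view and the tree-based view rank it highly, which is one reading of the fault tolerance the description attributes to combining the two.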

Step S4: perform feature compression on the data.

Dimensionality is reduced using principal component analysis. A transformation matrix W of dimension d × k is constructed, which maps a feature vector x into a new k-dimensional feature subspace whose dimension is lower than that of the original d-dimensional feature space:

$$z=xW,\qquad x\in\mathbb{R}^{1\times d},\quad W\in\mathbb{R}^{d\times k},\quad z\in\mathbb{R}^{1\times k},\quad k<d$$

The principal component analysis algorithm proceeds as follows:

(1) Standardize the original d-dimensional data set;

(2) Construct the sample covariance matrix;

(3) Compute the eigenvalues and corresponding eigenvectors of the covariance matrix;

(4) Select the eigenvectors corresponding to the k largest eigenvalues, where k is the dimension of the new feature space (k < d);

(5) Construct the mapping matrix W from these top k eigenvectors;

(6) Transform the d-dimensional input data set into the new k-dimensional feature subspace using the mapping matrix W.
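The six steps can be followed in a small, dependency-free sketch. To avoid a linear-algebra library, the eigenvectors are found here by power iteration with deflation, which is a substitute chosen for the example and not something the source prescribes; the function names, iteration count, and random seed are likewise assumptions. Constant columns (zero variance) are not handled.

```python
import math
import random


def standardize(rows):
    """(1) Scale every column to zero mean and unit sample variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def covariance(rows):
    """(2) Sample covariance matrix of rows that are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) / (n - 1) for b in range(d)]
            for a in range(d)]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def top_eigenvectors(matrix, k, iters=500, seed=0):
    """(3)-(4) Eigenvectors of the k largest eigenvalues (power iteration)."""
    rng = random.Random(seed)
    d = len(matrix)
    work = [row[:] for row in matrix]
    vectors = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(work, v)))
        vectors.append(v)
        # Deflate so the next pass converges to the next eigenvector.
        work = [[work[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
                for a in range(d)]
    return vectors


def pca(rows, k):
    """(5)-(6) Build W from the top-k eigenvectors and project: z = xW."""
    scaled = standardize(rows)
    w_columns = top_eigenvectors(covariance(scaled), k)
    return [[sum(x * w for x, w in zip(row, col)) for col in w_columns]
            for row in scaled]
```

For example, `pca([[1, 2], [2, 4], [3, 6], [4, 8]], 1)` compresses two perfectly correlated columns into a single component that carries all of the variance of the standardized data.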

In the Spark cluster, the compressed time-series data set is split in an 8:2 ratio into a training data set and a test data set. The training data set is used to train the model, and the test data set is used to evaluate it.
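A minimal sketch of the 8:2 split follows. It assumes the split is chronological (the earliest 80% of samples for training, the latest 20% for testing), which is a common choice for time series but is not stated in the source; the function name `chronological_split` is hypothetical.

```python
def chronological_split(samples, train_ratio=0.8):
    """Split an ordered sequence into (train, test) without shuffling."""
    cut = int(len(samples) * train_ratio)
    return samples[:cut], samples[cut:]


# Example: ten hourly records, oldest first.
train, test = chronological_split(list(range(10)))
```

Keeping the order intact means the model is always evaluated on data that is later than anything it was trained on.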

Step S5: build a recurrent neural network based on a long short-term memory network.

Because the hourly load of the next 24 hours is to be predicted, the time step of the long short-term memory network is set to 24; that is, the sequence of 24 consecutive hourly loads is used as one sample. The specific structure is shown in Fig. 2. The training features are the compressed weather feature data, and the training label is the load value at the next time point. With the long short-term memory network structure, the "vanishing gradient" problem does not arise to impair training even when the time step is long.
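One way to turn the hourly series into samples with a time step of 24 is a sliding window, sketched below. The exact sample layout is not spelled out in the source, so the pairing used here (24 consecutive feature rows as the input, the load at the hour immediately after the window as the label) and the function name `make_windows` are assumptions.

```python
def make_windows(features, loads, time_steps=24):
    """Build (window, label) pairs from aligned hourly series.

    window: `time_steps` consecutive feature rows
    label:  the load at the time point right after the window
    """
    samples = []
    for start in range(len(features) - time_steps):
        window = features[start:start + time_steps]
        label = loads[start + time_steps]
        samples.append((window, label))
    return samples


# Example: 30 hours of one compressed feature and the matching loads.
features = [[float(hour)] for hour in range(30)]
loads = [hour * 10 for hour in range(30)]
samples = make_windows(features, loads)
```

With 30 hours of data and a 24-hour window, the example yields six samples, each shifted by one hour.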

Training a long short-term memory network on long sequences demands considerable computing power, so truncated backpropagation through time is used to train the network quickly. Taking the truncated backpropagation through time of length 4 in Fig. 3 as an example, the backward pass of each parameter update traverses only 4 time steps, which reduces the complexity of parameter updates in the network. When the input sequences are long, truncated backpropagation through time raises the frequency of parameter updates, so the network can be trained faster with the same computing power.
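The effect of truncation can be seen on a toy model. The sketch below uses a scalar linear recurrence h_t = w * h_{t-1} + x_t with a squared-error loss instead of an LSTM, purely to keep the gradient short enough to write by hand; the model, the function name `tbptt_train`, and the learning rate are assumptions for illustration. The hidden state is carried across segments, but the gradient is cut at each segment boundary, so every update looks back at most `k` steps.

```python
def tbptt_train(xs, ys, w, k=4, lr=0.01):
    """Train the scalar recurrence h_t = w * h_{t-1} + x_t with truncation.

    Returns the updated weight and the number of parameter updates made.
    """
    h = 0.0
    updates = 0
    for start in range(0, len(xs), k):
        # The gradient does not flow past the start of the segment:
        # the incoming hidden state is treated as a constant.
        dh_dw = 0.0
        grad = 0.0
        for x, y in zip(xs[start:start + k], ys[start:start + k]):
            dh_dw = h + w * dh_dw      # d h_t / d w within the segment
            h = w * h + x              # forward step
            grad += (h - y) * dh_dw    # d/dw of 0.5 * (h_t - y_t) ** 2
        w -= lr * grad                 # one update per segment
        updates += 1
    return w, updates
```

For this scalar case the derivative is accumulated forward in time, which gives the same value as backpropagating through the segment. A sequence of length 8 with `k=4` produces two parameter updates, where full backpropagation through time would produce one.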

Step S6: train the model in a distributed manner using the Spark cluster.

A distributed training scheme using asynchronous stochastic gradient descent (asynchronous SGD) based on data parallelism is adopted. Let the loss function be L. For n parameters, the gradient vector of the loss function is:

$$\nabla L=\left(\frac{\partial L}{\partial w_1},\frac{\partial L}{\partial w_2},\dots,\frac{\partial L}{\partial w_n}\right)$$

With learning rate a, the parameter vector W after the (i+1)-th iteration of SGD is:

$$W_{i+1}=W_i-a\sum_{j=1}^{n}\nabla L_j$$

where $W_i$ is the parameter vector after the i-th iteration, $\nabla L_j$ is the gradient vector of the loss function obtained after the j-th compute node has trained on its copy of the data, and n, in this update formula, is the number of compute nodes.

In asynchronous stochastic gradient descent, each parameter update is applied to the parameter vector as soon as it has been computed, without strictly following a parameter-update cycle. Asynchronous SGD can achieve higher throughput in a distributed system: worker nodes spend more time on useful computation rather than waiting for a parameter-averaging step to complete. In addition, worker nodes can incorporate information from other worker nodes sooner than with synchronous updates.
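The behaviour described here can be imitated with a deterministic, single-process simulation. In the sketch below, workers finish in round-robin order, each computes its gradient on a possibly stale copy of the parameter, and the update is applied immediately without waiting for the other workers. The scalar parameter, the round-robin schedule, and the function name `async_sgd` are simplifying assumptions; a real cluster would apply updates in whatever order they arrive.

```python
def async_sgd(grad_fns, w0, lr=0.1, rounds=200):
    """Simulate asynchronous SGD on a single scalar parameter."""
    w = w0
    local = [w0] * len(grad_fns)          # each worker's (stale) copy of w
    for step in range(rounds):
        j = step % len(grad_fns)           # worker j finishes next
        update = lr * grad_fns[j](local[j])  # gradient on stale parameters
        w -= update                        # applied at once, no barrier
        local[j] = w                       # worker j pulls the fresh value
    return w


# Example: three workers, each holding a data shard whose loss
# 0.5 * (w - target) ** 2 pulls w toward a different target.
targets = [2.0, 3.0, 4.0]
grad_fns = [lambda w, t=t: w - t for t in targets]
w_final = async_sgd(grad_fns, w0=0.0)
```

Despite the stale gradients, the parameter settles close to the value that balances the three shards, which illustrates why skipping the synchronization barrier can be acceptable when the learning rate is small.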

Distributed model training with asynchronous SGD is carried out on the Spark computing cluster. When the cluster has fewer than 32 nodes, plain mode is used (see Fig. 4); when the cluster is larger, mesh mode is used (see Fig. 5). Both training modes encode and compress the parameter updates exchanged between nodes, reducing inter-node traffic and effectively increasing training speed.

In plain mode, worker nodes send their quantized, encoded updates to the master node, which then propagates the updates to the remaining nodes. This ensures that the master node always holds the latest version of the model. Meanwhile, the fault-tolerance mechanism of the Spark cluster ensures reliable communication between nodes: if the Spark Master node fails, the cluster can elect a new Master node, avoiding a single point of failure.

Mesh mode is a multiway tree whose root node is the Spark Master. By default, each node can have at most eight downstream nodes, and the node tree in the Spark cluster can have up to five levels. In mesh mode, each node relays encoded updates to all nodes connected to it, and each node aggregates the updates received from all other nodes connected to it. In mesh mode, the Master node is no longer a performance bottleneck, because the traffic it receives directly is reduced.

Step S7: from the electricity load data and weather feature data of the previous time point, perform distributed prediction with the long short-term memory network model that has been built, to obtain the predicted load at the current time point.

The electricity load data and weather feature data of the previous moment are read from the HBase cluster, and the load is predicted in parallel using the Spark cluster. A reasonable number of data partitions and block size are set according to the state of the cluster to maximize prediction speed.

Step S8: store the electricity load data and weather data collected in real time in the HBase cluster, and display them through a graphical interface on the Web side.

The load and weather feature data collected in real time are stored in HBase for future model training. Prediction results are written to the message queue; the predicted load and the true value are received from the message queue in real time using network programming techniques and displayed as a line chart, so that errors and trends can be observed.

The load forecasting method based on a long short-term memory network provided by the embodiments of the present invention has at least the following technical effects:

Based on big-data processing technologies such as Hadoop and Spark, within-cluster mean imputation using K-means clustering, feature screening that combines the Pearson correlation coefficient with a gradient boosting decision tree, and feature compression by principal component analysis greatly reduce the amount of data used for model training while retaining as much of the original data's characteristics as possible, which speeds up model training. The long short-term memory network is trained on the data in a distributed manner, which greatly speeds up training compared with conventional single-machine training. Using the trained model saved on the distributed system, distributed parallel load prediction is carried out by the Spark computing cluster, reducing prediction time. The invention can perform electric load forecasting efficiently and accurately in big-data scenarios. While guaranteeing a certain fault tolerance and accuracy, the invention predicts the electricity load of a given region at a given future moment, providing a useful reference for the relevant departments in dispatching power resources.

Finally, it should be noted that the above specific embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to examples, those of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently replaced without departing from its spirit and scope, and all such modifications and replacements shall fall within the scope of the claims of the present invention.

Claims

1. A load forecasting method based on a long short-term memory network, characterized by comprising the following steps:

Step S1, collecting the electricity load data of a target area over a given time period, together with the corresponding weather feature data, to form a raw data set;

Step S2, handling missing values in the raw data set using a Spark cluster;

Step S3, performing feature selection on the raw data set;

Step S4, performing feature compression on the raw data set;

Step S5, building a prediction model based on a long short-term memory network;

Step S6, training the prediction model in a distributed manner using the Spark cluster;

Step S7, performing distributed prediction with the prediction model, based on the weather feature data and electricity load data of the previous time point, to obtain the predicted load at the current time point.

2. The load forecasting method based on a long short-term memory network according to claim 1, characterized by further comprising:

Step S8: storing the electricity load data and weather feature data collected in real time in an HBase cluster, and displaying the predicted load together with the load value collected in real time.

3. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S1, the raw data set formed is stored in an HBase cluster.

4. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S2, the Spark cluster is used to perform K-means clustering, after which missing values are filled by within-cluster mean imputation.

5. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S3, the Pearson correlation coefficient between the electricity load data and the weather feature data is calculated, and features are ranked by importance by training a gradient boosting decision tree.

6. The load forecasting method based on a long short-term memory network according to claim 5, characterized in that feature variables whose significance value is higher than 0.05 are rejected, all variables are ranked by correlation, and the top 30% of features are taken as a first feature set; a gradient boosting decision tree is trained to obtain a feature importance ranking, and the top 30% of features are taken as a second feature set; and the intersection of the first feature set and the second feature set is taken as the screened features.

7. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S4, feature compression is performed on the weather feature data using principal component analysis.

8. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S5, parameters are updated using truncated backpropagation through time.

9. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S6, data-parallel distributed training is carried out on the Spark cluster using asynchronous stochastic gradient descent.

10. The load forecasting method based on a long short-term memory network according to claim 1, characterized in that, in step S7, the electricity load data and weather feature data of the previous time point are obtained, the degree of parallelism is set according to the data volume and the cluster hardware information, and the load is predicted using the Spark cluster.