CN114399101A - TCN-BIGRU-based gas load prediction method and device - Google Patents
Classifications
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/045 — Combinations of networks
- G06N3/049 — Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08 — Learning methods
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06Q10/06393 — Score-carding, benchmarking or key performance indicator [KPI] analysis
- G06Q50/06 — Energy or water supply
Abstract
An embodiment of the invention provides a gas load prediction method and device based on TCN-BIGRU. The method comprises: obtaining historical feature data; screening the historical feature data, preprocessing the screened data, and using the preprocessed data as training data; and constructing a TCN-BIGRU model, inputting the training data into the model for training, and using the trained TCN-BIGRU model as a gas load prediction model to predict the next day's gas load. In this way, the gas load can be predicted accurately by the TCN-BIGRU model, improving a gas company's operating efficiency and reducing its purchasing costs.
Description
Technical Field
The present invention relates generally to the field of gas load prediction, and more particularly, to a TCN-BIGRU based gas load prediction method and apparatus.
Background
With the evolution of energy structures and environmental-protection policies, demand for natural gas in China is growing rapidly. However, natural gas resources in China are very unevenly distributed, and supply and demand are severely mismatched; during peak periods such as winter, shortages are almost inevitable. Accurately predicting natural gas consumption is therefore of great significance for planning natural gas purchase and sales strategies.
Natural gas load is affected by weather (e.g., temperature, humidity, atmospheric pressure) and social factors (e.g., economic development, population growth, industrial production), with holidays and seasons being the most important factors. Natural gas consumption is highly volatile and random over time, which makes prediction difficult. Conventional time-series analysis methods have been used to predict natural gas consumption, including moving averages, autoregressive moving averages, autoregressive integrated moving averages (ARIMA), Kalman filtering, and wavelet transforms. These methods can capture linear relationships between influencing factors but are weak at describing nonlinear features.
Conventional artificial intelligence methods are also widely used, such as support vector regression, artificial neural networks, Bayesian networks, matrix factorization, and Gaussian process regression. These methods can extract nonlinear relationships between features and can handle small samples. However, on large amounts of data they suffer from the curse of dimensionality and have higher computational complexity. To address these problems, deep belief networks based on restricted Boltzmann machines, stacked denoising autoencoders, and convolutional neural networks have been proposed, with performance superior to the methods above. However, they require manual feature extraction, and it is difficult for them to relate past time points to future ones. The recurrent neural network (RNN) was proposed for processing time series, but it struggles to remember long-term information, and over long time intervals it suffers from vanishing and exploding gradients. Long short-term memory (LSTM) and the gated recurrent unit (GRU) effectively solve this problem: the GRU improves on the recurrent neural network by using gates to explore time series efficiently and improve prediction accuracy. However, the gated recurrent unit (GRU) can only extract features in a single direction.
Disclosure of Invention
According to the embodiment of the invention, a gas load prediction scheme based on TCN-BIGRU is provided. According to the scheme, the gas load can be accurately predicted through the TCN-BIGRU model, so that the operation efficiency of a gas company is improved, and the purchasing cost is reduced.
In a first aspect of the invention, a TCN-BIGRU based gas load prediction method is provided. The method comprises the following steps:
acquiring historical characteristic data;
screening the historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data;
and constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
Further, the historical feature data comprises time-series data and non-time-series data; wherein, the time sequence data is the total daily load data in the historical data; the non-time sequence data are holiday data and weather data corresponding to the current day of the historical data.
Further, the screening the historical feature data includes:
screening out historical characteristic data with the Pearson correlation coefficient larger than a threshold value;
wherein the Pearson correlation coefficient is:

ρ(X,Y) = cov(X, Y) / (σ_X · σ_Y) = E[(X − E(X)) · (Y − E(Y))] / (σ_X · σ_Y)

where ρ(X,Y) is the Pearson correlation coefficient; X is the weather data for a given day in the historical data; Y is the daily total load in the historical data; E(·) denotes expectation; and σ_X, σ_Y are the standard deviations of X and Y.
Further, preprocessing the screened historical feature data includes:
normalizing the daily total load data in the historical data;
one-hot encoding the holiday data for each day in the historical data;
and normalizing the maximum temperature data and minimum temperature data among the weather data.
Further, the TCN-BIGRU model comprises, in order, an input layer, a one-dimensional convolution layer, a causal dilated convolution layer, a BIGRU layer, and an output layer;
the input layer filters the time-series data with a sliding window and outputs the filtered time-series data to the one-dimensional convolution layer;
the one-dimensional convolution layer extracts local trend features from the filtered time-series data and outputs them to the causal dilated convolution layer;
the causal dilated convolution layer extracts hidden information and long-term temporal relationships from the features and outputs them to the BIGRU layer;
the BIGRU layer learns the output vectors of the causal dilated convolution layer with forward and backward GRU network structures to obtain bidirectional time-series features, combines them with the non-time-series features, and feeds the result to the output layer;
and the output layer, a fully connected layer, outputs the predicted next-day gas load from the combined time-series and non-time-series features.
Further, the loss function of the TCN-BIGRU model is defined as the mean absolute error:

MAE = (1/M) · Σ_{i=1}^{M} |y_i − ŷ_i|

where MAE is the mean absolute error; M is the total number of days for which the next-day gas load is predicted; y_i is the actual gas load on day i; and ŷ_i is the predicted gas load on day i.
Further, the time-series data is:

x1 = [x_{t−T+1}, x_{t−T+2}, ..., x_t]^T

where x1 is the time-series data; t is the current time; and T is the sliding-window length. The non-time-series data is:

x2 = [Qmax(s), Qmin(s), I(s), i(s)]

where x2 is the non-time-series data; Qmax(s) is the forecast maximum temperature for the prediction day; Qmin(s) is the forecast minimum temperature for the prediction day; I(s) is a workday indicator function, with I(s) = 1 if the prediction day is a workday; and i(s) is a non-workday indicator function, with i(s) = 0 if the prediction day is a non-workday.
In a second aspect of the invention, a TCN-BIGRU based gas load prediction apparatus is provided. The device includes:
the acquisition module is used for acquiring historical characteristic data;
the preprocessing module is used for screening the historical characteristic data, preprocessing the screened historical characteristic data and taking the preprocessed historical characteristic data as training data;
and the model training module is used for constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
In a third aspect of the invention, an electronic device is provided. The electronic device comprises at least one processor and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the first aspect of the invention.
In a fourth aspect of the invention, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of the first aspect of the invention.
It should be understood that the statements herein reciting aspects are not intended to limit the critical or essential features of any embodiment of the invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of embodiments of the present invention will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters denote like or similar elements, and wherein:
FIG. 1 shows a flow diagram of a TCN-BIGRU based gas load prediction method according to an embodiment of the invention;
FIG. 2 shows a schematic structural diagram of a TCN-BIGRU model according to an embodiment of the invention;
FIG. 3 shows a block diagram of a TCN-BIGRU based gas load prediction device according to an embodiment of the present invention;
FIG. 4 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present invention;
here, 400 denotes an electronic device, 401 denotes a CPU, 402 denotes a ROM, 403 denotes a RAM, 404 denotes a bus, 405 denotes an I/O interface, 406 denotes an input unit, 407 denotes an output unit, 408 denotes a storage unit, and 409 denotes a communication unit.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In addition, the term "and/or" herein is only one kind of association relationship describing an associated object, and means that there may be three kinds of relationships, for example, a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
In the invention, the meteorological, data, and economic features of natural gas load are analyzed. The data are processed with a sliding window, intended to capture the actual variation and fluctuation trends of the gas series. Next, a temporal convolutional network (TCN) is used to extract hidden features, since its receptive field size is easy to change. Finally, a Bi-GRU model is introduced to extract both past and future features. The results show that a neural network combining a bidirectional gated recurrent unit with a causal dilated convolution model performs well.
FIG. 1 shows a flow chart of a TCN-BIGRU based gas load prediction method of an embodiment of the present invention.
The method comprises the following steps:
and S101, acquiring historical characteristic data.
The historical feature data includes time series data and non-time series data.
The time series data is daily load total data in the history data. The daily load total data is the total load of each day in the historical data. The time sequence data is data sorted according to the sequence of time.
The non-time-series data are the holiday data and weather data for each day of the historical data. The holiday data distinguishes workdays and non-workdays with different identifiers, for example "0" for a workday and "1" for a non-workday. The weather data includes the temperature, humidity, and weather conditions for the corresponding day of the historical data. Non-time-series data are data that are not ordered in time.
S102, screening the historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data.
First, screening the historical feature data includes:
screening out historical characteristic data with the Pearson correlation coefficient larger than a threshold value; the threshold value is, for example, 0.5.
The Pearson correlation coefficient is:

ρ(X,Y) = cov(X, Y) / (σ_X · σ_Y) = E[(X − E(X)) · (Y − E(Y))] / (σ_X · σ_Y)

where ρ(X,Y) is the Pearson correlation coefficient; X is the weather data for a given day in the historical data; Y is the daily total load in the historical data; E(·) denotes expectation; and σ_X, σ_Y are the standard deviations of X and Y.
Through the screening, historical daily load total data, holiday data, highest temperature data and lowest temperature data are selected.
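As a concrete illustration, the Pearson screening step can be sketched in pure Python. The threshold of 0.5 follows the example above; the function names and the feature dictionary layout are illustrative assumptions, not the patent's actual implementation:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def screen_features(features, load, threshold=0.5):
    """Keep only features whose |correlation| with the daily load exceeds the threshold."""
    return {name: series for name, series in features.items()
            if abs(pearson(series, load)) > threshold}
```

For example, a temperature series that tracks the load perfectly survives the screen, while an uncorrelated series is dropped.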
Second, preprocessing the screened historical feature data comprises the following steps:
normalizing the daily total load data in the historical data;
one-hot encoding the holiday data for each day of the historical data, for example, representing workdays and non-workdays by the one-hot codes "0" and "1";
and normalizing the maximum temperature data and minimum temperature data among the weather data.
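A minimal sketch of these two preprocessing steps. The patent does not fix the exact normalization scheme, so min-max scaling here is an illustrative assumption, as is the two-element one-hot layout:

```python
def min_max_normalize(series):
    """Scale a series to [0, 1] (one common normalization choice; assumed, not
    specified by the patent)."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]

def one_hot_holiday(is_workday):
    """One-hot encode the holiday flag: workday -> [1, 0], non-workday -> [0, 1]."""
    return [1, 0] if is_workday else [0, 1]
```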
S103, constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training it, and taking the trained TCN-BIGRU model as the gas load prediction model. The next-day gas load is then predicted with the gas load prediction model.
As shown in FIG. 2, the TCN-BIGRU model comprises, in order, an input layer, a one-dimensional convolution layer, a causal dilated convolution layer, a BIGRU layer, and an output layer.
The input layer is used for filtering the time series data by setting a sliding window and outputting the filtered time series data to the one-dimensional convolutional layer.
The input layer of the TCN-BIGRU model uses the preprocessed and feature-filtered data, i.e., the highest temperature, the lowest temperature, the holiday information, and the gas load value of the current time step as inputs. The input includes time-series data and non-time-series data.
In the present embodiment, a sliding window over time steps is used, with the window size set to 10. The time-series input is:

x1 = [x_{t−T+1}, x_{t−T+2}, ..., x_t]^T

where x1 is the time-series data; t is the current time; and T is the sliding-window length. The non-time-series input is:

x2 = [Qmax(s), Qmin(s), I(s), i(s)]

where x2 is the non-time-series data; Qmax(s) is the forecast maximum temperature for the prediction day; Qmin(s) is the forecast minimum temperature for the prediction day; I(s) is a workday indicator function, with I(s) = 1 if the prediction day is a workday; and i(s) is a non-workday indicator function, with i(s) = 0 if the prediction day is a non-workday.
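Constructing the windowed training samples from the daily-load series can be sketched as follows; the helper name and the (window, next-day target) pairing are illustrative assumptions consistent with the description above:

```python
def make_windows(load, T=10):
    """Slide a window of length T over the daily-load series: each window is
    the model's time-series input x1 = [x_{t-T+1}, ..., x_t], and the value
    immediately after the window is the next-day target."""
    samples = []
    for t in range(T, len(load)):
        x1 = load[t - T:t]   # the T most recent daily loads
        y = load[t]          # next-day load to predict
        samples.append((x1, y))
    return samples
```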
The one-dimensional convolution layer extracts local trend features from the filtered time-series data and outputs them to the causal dilated convolution layer.
In each one-dimensional convolution layer, a window over a fixed time length is used to process the time-series data, and sliding the window captures short-term fluctuation trends. The sequence segments within the window can be learned to capture local trend features of the time series. Convolving the gas data also shortens the one-dimensional sequence, which improves computational efficiency.
The causal dilated convolution layer extracts hidden information and long-term temporal relationships from the features and outputs them to the BIGRU layer. Specifically, the data output by the one-dimensional convolution layer are fed into the causal dilated convolution layer for feature extraction; this layer effectively extracts features of the input data, including hidden information and long-term temporal relationships, while reducing the feature dimensionality of the input and improving computational efficiency.
The causal dilated convolution layer, i.e., the temporal convolutional network (TCN), is an algorithm for processing time series. It introduces causal convolution, dilated convolution, and residual modules, which solve the problem of extracting long-term dependencies from a time series. Its structure consists of the following three parts.
First, causal convolution (Causal Convolutions)
For a given input, the output at a time step depends only on the current and past time steps, never on future inputs; causal convolution therefore sees only historical data. On its own, this structure does not predict well over longer time sequences, because its view of history is limited.
Second, dilated convolution (Dilated Convolutions)
To solve the problem that causal convolution can only receive short-range historical information, dilated convolution is introduced.
For a one-dimensional time series X = (x_0, x_1, x_2, x_3, ..., x_t, ..., x_T) and a filter f : {0, 1, 2, ..., n−1}, the dilated convolution operation H on element s of the sequence is defined as:

H(s) = Σ_{i=0}^{n−1} f(i) · x_{s − d·i}

where n is the size of the filter, d is the dilation factor, and x_{s − d·i} are the input elements reached by the filter taps.
By increasing the filter size and the dilation factor, the TCN can extract features over a wider span. As the dilation factor grows, the top layer of the TCN accepts a wider range of historical inputs; after adding dilated convolution, the receptive field of the network is significantly larger than before.
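The dilated causal operation defined above can be sketched directly; out-of-range taps are zero-padded so every output stays causal (the implementation details beyond the formula are illustrative):

```python
def causal_dilated_conv(series, kernel, d=1):
    """Causal dilated convolution: the output at position s mixes inputs
    x[s - d*i] for taps i = 0..n-1, so it never looks at the future. Raising
    the dilation factor d widens the span of history each output sees."""
    n = len(kernel)
    out = []
    for s in range(len(series)):
        acc = 0.0
        for i in range(n):
            j = s - d * i
            if j >= 0:          # zero-pad positions before the series start
                acc += kernel[i] * series[j]
        out.append(acc)
    return out
```

With a two-tap kernel, dilation d=2 reaches two steps back instead of one, illustrating how stacked layers with growing d expand the receptive field.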
Third, residual modules (Residual Blocks)
In addition to adjusting the filter size and the dilation factor, the receptive field of the TCN can be enlarged by increasing the number of hidden layers.
Specifically, the residual structure comprises a causal dilated convolution layer, a weight normalization (WeightNorm) layer, an activation layer, and a dropout layer. The causal dilated convolution extracts hidden information from the input, i.e., information that cannot be obtained by directly observing the data: features of the input time series that the dilated convolution and the residual module mine but that are not directly observable. WeightNorm constrains the range of the weights, which changes the training speed. The activation layer uses rectified linear units (ReLU), which converge well, and dropout is used to mitigate overfitting of the network. Stacking residual modules into a deeper causal dilated convolution network extracts features better, so each convolution in the output layer can draw more information from the input layer.
The residual module adds a branch at its input that performs a transformation so that the dimensions match the existing features. The output of the residual module is defined as:

H(x) = F(x^(h−1)) + x^(h−1)
x^(h) = δ(H(x))

where F(·) denotes the series of transformation operations in the block, H(·) is the residual sum, and δ(·) is the activation function.
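The residual update x^(h) = δ(F(x) + x) can be sketched with plain lists; the transform and activation are passed in as functions, and the elementwise form (rather than tensors) is an illustrative simplification:

```python
def residual_block(x, transform, activation):
    """Residual connection as in a TCN block: the transformed features are
    added elementwise to the input, then passed through the activation."""
    return [activation(f + xi) for f, xi in zip(transform(x), x)]
```

For example, with a doubling transform and ReLU activation, each output is relu(2*x + x) = relu(3*x).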
A plain RNN can relate the predicted value only to the last few elements of the input sequence; with a causal convolution network, the relationship between the predicted value and a much longer stretch of the sequence (the long-term temporal relationship) can be extracted.
The BIGRU layer learns the output vectors of the causal dilated convolution layer with forward and backward GRU network structures to obtain bidirectional time-series features, combines them with the non-time-series features, and feeds the result to the output layer. Specifically, a bidirectional GRU mechanism uses a forward GRU and a backward GRU to learn the output vectors of the TCN network, with the hidden state of the GRU layer at time t denoted h_t.
The GRU neural network is structurally simplified relative to the LSTM, making the parameters less convergent and easier to converge. The GRU comprises two gate control units, an updating gate and a resetting gate, and also comprises hidden states and candidate hidden states, zt represents the output of the updating gate at the time t, rt represents the output of the resetting gate at the time t, htAndrepresenting the output of the hidden state and the candidate hidden state, respectively, the expressions of GRU are as follows:
r_t = σ(W_r · [h_(t−1), x_t] + b_r)
z_t = σ(W_z · [h_(t−1), x_t] + b_z)
h̃_t = tanh(W_h · [r_t ⊙ h_(t−1), x_t] + b_h)
h_t = (1 − z_t) ⊙ h_(t−1) + z_t ⊙ h̃_t
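One GRU time step following these equations can be sketched in NumPy as follows (the weight shapes and argument layout are assumptions of this sketch, not the patent's implementation):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gru_step(x_t, h_prev, Wr, Wz, Wh, br, bz, bh):
    """One GRU step. Each weight matrix has shape (hidden, hidden + input)."""
    concat = np.concatenate([h_prev, x_t])
    r_t = sigmoid(Wr @ concat + br)                      # reset gate
    z_t = sigmoid(Wz @ concat + bz)                      # update gate
    h_cand = np.tanh(Wh @ np.concatenate([r_t * h_prev, x_t]) + bh)
    h_t = (1.0 - z_t) * h_prev + z_t * h_cand            # new hidden state
    return h_t
```

With all weights zero both gates output 0.5 and the candidate state is 0, so the new hidden state is simply half of the previous one, which makes the update rule easy to check by hand.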
in the classical recurrent neural network, the transmission of states is developed from front to back in a single direction. When some devices perform data processing, the output at the present time is related not only to the previous state but also to the subsequent state. The bidirectional GRU is formed by superposing two GRUs up and down, the output is jointly determined by the states of the two GRUs, and the modeling capability of the time sequence of equipment operation in the degradation process can be better excavated.
The output layer is a fully connected layer, which outputs the predicted gas load value for the next day, i.e., time (t+1), according to the combined result of the time-series features and the non-time-series features; the prediction is denoted y_t.
The data transfer process of the TCN-BIGRU model comprises the following steps:
the preprocessed time-series data first pass through the one-dimensional convolution layer, which extracts local trend features of the time series; the causal dilated convolution layer then further extracts hidden information and long-term temporal relations from these features; the extracted features are input into the BIGRU layer to better learn bidirectional time-series features; finally, the features learned by the BIGRU layer are combined with the non-time-series features and input into the fully connected layer to obtain the final output.
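The fixed-length sliding-window reconstruction that feeds this pipeline can be sketched as follows (an illustrative NumPy helper; window length T as in claim 7, with the next-day value taken as the training target):

```python
import numpy as np

def sliding_windows(series, T):
    """Reconstruct a 1-D daily-load series into overlapping input
    windows of length T, each paired with the next day's value."""
    series = np.asarray(series, float)
    X = np.array([series[i:i + T] for i in range(len(series) - T)])
    y = series[T:]  # target: the day immediately after each window
    return X, y
```

For a 5-day series and T = 2 this yields three (window, next-day) training pairs, which then flow through the 1-D convolution, TCN, and BIGRU layers described above.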
Further, the loss function of the TCN-BIGRU model is defined as the mean absolute error (MAE), which reflects the actual magnitude of the prediction error. The mean of the absolute errors is:

MAE = (1/m) · Σ|y_i − ŷ_i|, summed over i = 1, ..., m
wherein MAE is the mean of the absolute errors; m is the total number of days for which the next-day gas quantity is predicted; y_i is the actual gas quantity on the i-th day; and ŷ_i is the predicted gas quantity on the i-th day.
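The loss defined above is a one-liner in NumPy:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error over the m predicted days."""
    return np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float)))
```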
Because the gas load data are large in volume and span a long time, a large amount of abnormal data may occur, and MAE is robust to such outliers. Meanwhile, a dynamic learning rate can effectively mitigate the drawback of MAE's fixed gradient.
Further, the output result of the gas load prediction model, namely the TCN-BIGRU model, is used as the gas load prediction value of the next day.
As an embodiment of the present invention, in order to evaluate the prediction performance of the method more completely, the root mean square error (RMSE), the mean absolute error (MAE), the mean absolute percentage error (MAPE) and R² can be selected to evaluate the performance of the model. These statistical indexes are defined as follows (sums over w = 1, ..., N; z̄ is the mean of the actual load values):

RMSE = √( (1/N) · Σ(z_w − ẑ_w)² )
MAE = (1/N) · Σ|z_w − ẑ_w|
MAPE = (100%/N) · Σ|(z_w − ẑ_w) / z_w|
R² = 1 − Σ(z_w − ẑ_w)² / Σ(z_w − z̄)²
wherein N represents the number of test instances, and z_w and ẑ_w represent the actual load value and the predicted load value in the w-th case. Each evaluation index has different advantages and disadvantages. RMSE measures the accuracy of the model through the deviation between the predicted and actual load values; it remains in the same dimension as the load, but it is very sensitive to large errors and therefore easily influenced by outliers. MAE represents the average absolute error between the predicted and actual load values and handles abnormal values better than RMSE, but it does not fully reflect the degree of prediction bias. MAPE expresses model accuracy as a percentage of absolute error and takes the relative error between the predicted and actual values into account, but it cannot be used when the actual load value is 0. R² maps the regression result to between 0 and 1; the closer the value is to 1, the better the model, which makes comparison across different models more convenient. In view of the above, it is necessary to evaluate the prediction performance of the model by combining multiple evaluation indexes. The evaluation results of the predicted load values based on these indexes are shown in Table 1:
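The four evaluation indexes can be computed with a short NumPy sketch of their standard definitions:

```python
import numpy as np

def rmse(z, z_hat):
    """Root mean square error; same unit as the load."""
    return np.sqrt(np.mean((z - z_hat) ** 2))

def mape(z, z_hat):
    """Mean absolute percentage error; undefined when any actual value is 0."""
    return np.mean(np.abs((z - z_hat) / z)) * 100.0

def r2(z, z_hat):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    ss_res = np.sum((z - z_hat) ** 2)
    ss_tot = np.sum((z - np.mean(z)) ** 2)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives RMSE = 0, MAPE = 0 and R² = 1, which matches the interpretation in the text that values of R² closer to 1 indicate a better model.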
TABLE 1
As can be seen from Table 1, the TCN-BiGRU model is superior to the other models on every evaluation index. On RMSE, the proposed model performs better than the TCN model, the GRU model and the BiGRU model, and its MAE is reduced by 5.98, 6.52 and 2.68 percentage points relative to these models, respectively, which demonstrates that the proposed method considerably improves gas load prediction accuracy.
In summary, the embodiments of the present invention provide a deep learning method based on causal dilated convolution and a bidirectional GRU. To address the strong randomness and difficulty of short-term gas load prediction, the invention reconstructs the gas data with a fixed-length sliding window, uses the TCN to enlarge the temporal receptive field and thus better extract hidden features, feeds the TCN output into the bidirectional GRU for feature extraction in both the forward and reverse directions, and finally outputs the load prediction through a fully connected layer. The combination of causal dilated convolution and the bidirectional gated recurrent unit extracts hidden features well and further improves short-term gas load prediction accuracy; in comparisons with traditional methods, the proposed method scores well and achieves high accuracy under different evaluation indexes.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary and alternative embodiments and that the acts and modules illustrated are not necessarily required to practice the invention.
The above is a description of an embodiment of the method, and the following is a further description of the solution of the present invention by an embodiment of the apparatus.
As shown in fig. 3, the apparatus 300 includes:
an obtaining module 310, configured to obtain historical feature data;
the preprocessing module 320 is configured to screen the historical feature data, preprocess the screened historical feature data, and use the preprocessed historical feature data as training data;
the model training module 330 is used for constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and taking the trained TCN-BIGRU model as a gas load prediction model; and predicting the gas load of the next day according to the gas load prediction model.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
In the technical scheme of the invention, the acquisition, storage, application and the like of the personal information of the related user all accord with the regulations of related laws and regulations without violating the good customs of the public order.
The invention also provides an electronic device and a readable storage medium according to the embodiment of the invention.
FIG. 4 shows a schematic block diagram of an electronic device 400 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not intended to limit implementations of the inventions described and/or claimed herein.
The device 400 comprises a computing unit 401 which may perform various suitable actions and processes in accordance with a computer program stored in a Read Only Memory (ROM)402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data required for the operation of the device 400 can also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
A number of components in device 400 are connected to I/O interface 405, including: an input unit 406 such as a keyboard, a mouse, or the like; an output unit 407 such as various types of displays, speakers, and the like; a storage unit 408 such as a magnetic disk, optical disk, or the like; and a communication unit 409 such as a network card, modem, wireless communication transceiver, etc. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/acts specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of the present invention, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combining a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions are possible, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A gas load prediction method based on TCN-BIGRU is characterized by comprising the following steps:
acquiring historical characteristic data;
screening the historical characteristic data, preprocessing the screened historical characteristic data, and taking the preprocessed historical characteristic data as training data;
and constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
2. The method of claim 1, wherein the historical characteristic data comprises time series data and non-time series data; the time series data is the total daily load data in the historical data; the non-time series data is the holiday data and weather data corresponding to the day in the historical data.
3. The method of claim 2, wherein the filtering the historical feature data comprises:
screening out historical characteristic data with the Pearson correlation coefficient larger than a threshold value;
wherein the Pearson correlation coefficient is:

ρ_(X,Y) = (E(XY) − E(X)E(Y)) / ( √(E(X²) − E²(X)) · √(E(Y²) − E²(Y)) )

wherein ρ_(X,Y) is the Pearson correlation coefficient; X is the weather data of the day in the historical data; Y is the total daily load in the historical data; and E(·) denotes expectation.
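The Pearson screening step can be sketched in NumPy (the function name is our own; this is the standard covariance-over-standard-deviations form of the coefficient):

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient: cov(X, Y) / (std(X) * std(Y))."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())
```

Features whose coefficient with the total daily load exceeds the chosen threshold would then be retained as model inputs.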
4. The method of claim 2, wherein the preprocessing of the filtered historical characteristic data comprises:
normalizing the total daily load data in the historical data;
carrying out one-hot coding on the holiday data corresponding to the day in the historical data;
normalizing the highest temperature data and the lowest temperature data in the weather data.
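The preprocessing steps of claim 4 can be sketched as follows (min-max scaling and the one-hot layout here are assumptions about the exact normalization and encoding used):

```python
import numpy as np

def min_max_normalize(v):
    """Scale a series into [0, 1]; one common normalization choice."""
    v = np.asarray(v, float)
    return (v - v.min()) / (v.max() - v.min())

def one_hot(labels, n_classes):
    """One-hot encode integer holiday / day-type labels."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out
```

The normalized load and temperature series supply the numeric inputs, while the one-hot holiday vectors form part of the non-time-series features.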
5. The method of claim 1, wherein the TCN-BIGRU model comprises an input layer, a one-dimensional convolutional layer, a causal expansion convolutional layer, a BIGRU layer, and an output layer arranged in this order;
the input layer is used for filtering the time series data by setting a sliding window and outputting the filtered time series data to the one-dimensional convolutional layer;
the one-dimensional convolutional layer is used for extracting local trend characteristics of the filtered time series data and outputting the local trend characteristics to the causal expansion convolutional layer;
the causal expansion convolutional layer is used for extracting hidden information and long-term time relation in the features and outputting the hidden information and the long-term time relation to the BIGRU layer;
the BIGRU layer learns the output vector of the causal expansion convolution layer by using a forward GRU network structure and a reverse GRU network structure to obtain a bidirectional time sequence characteristic, combines the bidirectional time sequence characteristic with a non-time sequence characteristic and inputs the bidirectional time sequence characteristic into the output layer;
and the output layer selects a full connection layer and is used for outputting the gas load predicted value of the next day according to the combined result of the time sequence characteristic and the non-time sequence characteristic.
6. The method of claim 5, wherein the loss function of the TCN-BIGRU model is defined as the mean of the absolute errors; the mean of the absolute errors is:

MAE = (1/m) · Σ|y_i − ŷ_i|, summed over i = 1, ..., m

wherein m is the total number of predicted days, y_i is the actual gas quantity on the i-th day, and ŷ_i is the predicted gas quantity on the i-th day.
7. The method of claim 2, wherein the time series data is:
x_1 = [x_(t−T+1), x_(t−T+2), ..., x_t]^T

wherein x_1 is the time series data; t is any time; T is the sliding window length;

x_2 = [Q_max(s), Q_min(s), I(s), i(s)]

wherein x_2 is the non-time series data; Q_max(s) is the highest temperature predicted for the day; Q_min(s) is the lowest temperature predicted for the day; I(s) is a working day indicator function, and I(s) is 1 if the prediction day is a working day; i(s) is a non-working-day indicator function, and i(s) is 0 if the prediction day is a non-working day.
8. A TCN-BIGRU-based gas load prediction device is characterized by comprising:
the acquisition module is used for acquiring historical characteristic data;
the preprocessing module is used for screening the historical characteristic data, preprocessing the screened historical characteristic data and taking the preprocessed historical characteristic data as training data;
and the model training module is used for constructing a TCN-BIGRU model, inputting the training data into the TCN-BIGRU model, training the TCN-BIGRU model, and predicting the gas load of the next day by taking the trained TCN-BIGRU model as a gas load prediction model.
9. An electronic device, comprising: at least one processor; and
a memory communicatively coupled to the at least one processor; characterized in that
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111658841.8A CN114399101A (en) | 2021-12-30 | 2021-12-30 | TCN-BIGRU-based gas load prediction method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111658841.8A CN114399101A (en) | 2021-12-30 | 2021-12-30 | TCN-BIGRU-based gas load prediction method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114399101A true CN114399101A (en) | 2022-04-26 |
Family
ID=81228281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111658841.8A Pending CN114399101A (en) | 2021-12-30 | 2021-12-30 | TCN-BIGRU-based gas load prediction method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114399101A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117454124A (en) * | 2023-12-26 | 2024-01-26 | 山东大学 | Ship motion prediction method and system based on deep learning |
CN118153901A (en) * | 2024-04-09 | 2024-06-07 | 成都秦川物联网科技股份有限公司 | Intelligent gas pipe network gas supply allocation method, internet of things system and medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111815035A (en) * | 2020-06-22 | 2020-10-23 | 国网上海市电力公司 | Short-term load prediction method fusing morphological clustering and TCN-Attention |
- 2021-12-30: CN202111658841.8A filed, published as CN114399101A, status Pending
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111815035A (en) * | 2020-06-22 | 2020-10-23 | 国网上海市电力公司 | Short-term load prediction method fusing morphological clustering and TCN-Attention |
Non-Patent Citations (2)
Title |
---|
LIANG LI,ET AL: ""Temporal Attention Based TCN-BIGRU Model for Energy Time Series Forecasting"", 《2021 IEEE INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE AND ELECTRONIC ENGINEERING (CSAIEE)》, pages 187 - 192 * |
郭玲 等: ""基于TCN-GRU模型的短期负荷预测方法"", 《电力工程技术》, vol. 40, no. 3, pages 66 - 71 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117454124A (en) * | 2023-12-26 | 2024-01-26 | 山东大学 | Ship motion prediction method and system based on deep learning |
CN117454124B (en) * | 2023-12-26 | 2024-03-29 | 山东大学 | Ship motion prediction method and system based on deep learning |
CN118153901A (en) * | 2024-04-09 | 2024-06-07 | 成都秦川物联网科技股份有限公司 | Intelligent gas pipe network gas supply allocation method, internet of things system and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Ma et al. | A hybrid attention-based deep learning approach for wind power prediction | |
Bin et al. | Regression model for appraisal of real estate using recurrent neural network and boosting tree | |
CN114399101A (en) | TCN-BIGRU-based gas load prediction method and device | |
Yu et al. | Improved convolutional neural network‐based quantile regression for regional photovoltaic generation probabilistic forecast | |
CN115587666A (en) | Load prediction method and system based on seasonal trend decomposition and hybrid neural network | |
CN111738331A (en) | User classification method and device, computer-readable storage medium and electronic device | |
CN114363195A (en) | Network flow prediction early warning method for time and spectrum residual convolution network | |
CN111985719A (en) | Power load prediction method based on improved long-term and short-term memory network | |
CN116485031A (en) | Method, device, equipment and storage medium for predicting short-term power load | |
CN116757465A (en) | Line risk assessment method and device based on double training weight distribution model | |
CN114266602A (en) | Deep learning electricity price prediction method and device for multi-source data fusion of power internet of things | |
Zhang et al. | Collaborative Forecasting and Analysis of Fish Catch in Hokkaido From Multiple Scales by Using Neural Network and ARIMA Model | |
CN113033903A (en) | Fruit price prediction method, medium and equipment of LSTM model and seq2seq model | |
CN116885699A (en) | Power load prediction method based on dual-attention mechanism | |
CN115759751A (en) | Enterprise risk prediction method and device, storage medium, electronic equipment and product | |
CN115545319A (en) | Power grid short-term load prediction method based on meteorological similar day set | |
CN115759343A (en) | E-LSTM-based user electric quantity prediction method and device | |
Sun et al. | Short-term stock price forecasting based on an SVD-LSTM model | |
Liu | Stock prediction using lstm and gru | |
CN114861800B (en) | Model training method, probability determining device, model training equipment, model training medium and model training product | |
Wang et al. | A-ConvRNN: A Prediction Model for E-Commerce Page Views Based on Convolutional Neural Network and Attention Mechanism | |
CN115759373A (en) | Gas daily load prediction method, device and equipment | |
US20230419128A1 (en) | Methods for development of a machine learning system through layered gradient boosting | |
CN115689036A (en) | Gas daily load prediction method based on Prophet-BIGRU | |
CN116307159A (en) | Load prediction method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||