CN111291782B - Accumulated load prediction method based on information accumulation k-Shape clustering algorithm - Google Patents

Info

Publication number
CN111291782B
Authority
CN
China
Prior art keywords
prediction
load
cumulative
shape
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010032213.8A
Other languages
Chinese (zh)
Other versions
CN111291782A (en
Inventor
张宇帆
艾芊
王历晔
于琪
刘育权
熊文
王莉
蔡莹
吴任博
李俊格
黄开艺
余志文
张扬
李诗颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Guangzhou Power Supply Bureau Co Ltd
Original Assignee
Shanghai Jiaotong University
Guangzhou Power Supply Bureau Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University and Guangzhou Power Supply Bureau Co Ltd
Priority to CN202010032213.8A
Publication of CN111291782A
Application granted
Publication of CN111291782B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/231 Hierarchical techniques, i.e. dividing or merging pattern sets so as to obtain a dendrogram
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • H ELECTRICITY
    • H02 GENERATION; CONVERSION OR DISTRIBUTION OF ELECTRIC POWER
    • H02J CIRCUIT ARRANGEMENTS OR SYSTEMS FOR SUPPLYING OR DISTRIBUTING ELECTRIC POWER; SYSTEMS FOR STORING ELECTRIC ENERGY
    • H02J3/00 Circuit arrangements for ac mains or ac distribution networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Primary Health Care (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Power Engineering (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an accumulated load prediction method based on an information accumulation k-Shape clustering algorithm, comprising the following steps: performing k-Shape clustering according to the shape characteristics of the electrical load curves; converting the resulting partition of the load curves into a similarity matrix, and then a distance matrix, of the load curves among users; applying hierarchical clustering to the distance matrix to obtain a hierarchical structure describing the distance between users; selecting different numbers of clusters to obtain different cluster partitions of the users, training a learning model, and performing probabilistic and deterministic prediction of the users' cumulative load; and weighting the probabilistic and deterministic cumulative load predictions of each cluster partition and combining them into the final cumulative load forecast. The invention captures the shape information of the users' electrical load more comprehensively without relying on extracted features, facilitates the description of users' electricity consumption characteristics, realizes ensemble learning for cumulative load prediction, and improves the accuracy of both probabilistic and deterministic prediction.

Description

Accumulated load prediction method based on information accumulation k-Shape clustering algorithm
Technical Field
The invention relates to the technical field of load prediction, in particular to an accumulated load prediction method based on an information accumulation k-Shape clustering algorithm.
Background
With the deregulation of the electricity industry, load aggregators (agents that aggregate a group of users equipped with smart meters) are becoming important participants in demand-side management. Cumulative load forecasts provide the basis for a load aggregator's decision-making. Current methods for cumulative load prediction fall into three categories: 1) fully aggregated methods; 2) fully decentralized (completely dispersed) methods; 3) clustering-based methods. A fully aggregated method sums the loads of all users and then forecasts the total. In contrast, a fully decentralized method forecasts each user's load separately and then sums the forecasts. A clustering-based method first divides the users' loads into several clusters, forecasts the total load of each cluster separately, and then sums the cluster forecasts to form the final prediction.
Cluster-based cumulative load prediction has been the focus of many studies. As the first step of this approach, choosing a suitable clustering method is crucial. As a shape-based time-series clustering method, k-Shape outperforms other clustering methods in fields such as load prediction, energy management, and cumulative load prediction. Selecting appropriate input features for the clustering algorithm is another important issue. So far, most of the literature takes the Representative Load Patterns (RLPs) of power consumers as the clustering input feature and groups the consumers' load curves accordingly. The average load over a period of time is a typical RLP; however, such an RLP curve reflects no statistical characteristic of the load other than its mean.
The existing cumulative load prediction methods have the following problems, which make it difficult for a load aggregator to make accurate decisions:
1) The load of a single power user fluctuates strongly, and fully decentralized prediction methods suffer from low prediction accuracy because this uncertainty is difficult to handle.
2) Current clustering-based load prediction methods usually rely on a single prediction result and neglect the improvement in accuracy that a reasonable combination of different results could bring.
3) The input feature of current clustering algorithms is the average load over a period of time, which cannot reflect statistical characteristics of the load other than the mean.
Therefore, those skilled in the art are dedicated to developing an accumulated load prediction method based on an information accumulation k-Shape clustering algorithm that realizes data-driven cumulative load prediction at the load aggregator.
Disclosure of Invention
In view of the above defects in the prior art, the technical problem to be solved by the present invention is to provide an accumulated load prediction method based on an information accumulation k-Shape clustering algorithm that realizes data-driven cumulative load prediction at the load aggregator.
In order to achieve the above object, the present invention provides an accumulated load prediction method based on an information accumulation k-Shape clustering algorithm, characterized in that the method comprises the following steps:
step 1, executing k-Shape clustering according to the shape characteristics of the power load curves of the users;
step 2, converting the partition of the load curves into a similarity matrix of the load curves among users by combining the information obtained from the clustering;
step 3, converting the similarity matrix into a distance matrix;
step 4, applying a single-linkage hierarchical clustering algorithm to the distance matrix to obtain a hierarchical structure describing the distance between users;
step 5, selecting different numbers of clusters according to the hierarchical structure obtained in the step 4 to obtain different cluster partitions of the users, training a learning model, and performing probabilistic prediction and deterministic prediction of the cumulative load of the users;
step 6, determining the weights of the probabilistic and deterministic cumulative load prediction results of each cluster partition, and combining these results into a final cumulative load prediction result.
Further, the step 1 specifically includes the following steps:
step 1.1, representing the electrical load curve as a training set
Figure BDA0002364735430000021
The load data set of user i is represented as
Figure BDA0002364735430000022
where N is the number of power consumers, m is the length of the load sequence, and N_tr is the size of the training set;
step 1.2, solving an NP-hard optimization problem in a heuristic manner to realize k-Shape clustering.
Further, the specific formula of the NP-hard optimization problem of step 1.2 is:
Figure BDA0002364735430000023
wherein
Figure BDA0002364735430000024
is the centroid of cluster p_j ∈ P, and
Figure BDA0002364735430000025
is the shape distance measure of the sequences of length m
Figure BDA0002364735430000026
Further, the specific formula of
Figure BDA0002364735430000027
is as follows:
Figure BDA0002364735430000028
wherein
Figure BDA0002364735430000029
is a measure of the cross-correlation of the sequences, with w = 1, ..., 2m-1;
Figure BDA00023647354300000210
Figure BDA00023647354300000211
Further, the step 2 specifically includes the following steps:
step 2.1, after the clustering is finished, counting the number of load curves of each user u_j contained in each cluster p_i, denoted as
Figure BDA00023647354300000212
where i = 1, ..., k and j = 1, ..., N;
step 2.2, defining the similarity matrix
Figure BDA0002364735430000031
as
Figure BDA0002364735430000032
Further, the step 3 specifically includes the following steps:
converting the similarity matrix into the distance matrix by D = I - S,
Figure BDA0002364735430000033
where every element of the matrix
Figure BDA0002364735430000034
is 1.
Further, the hierarchical structure in the step 4 is described by a dendrogram.
Further, the step 5 specifically includes the following steps:
by selecting different numbers of clusters
Figure BDA0002364735430000035
|N_C| different cluster partitions are obtained for the N users;
for the ith cluster partition, it is necessary to train
Figure BDA0002364735430000036
models; the model f_{i,j}, with index
Figure BDA0002364735430000037
is trained on the jth cluster, i.e.
Figure BDA0002364735430000038
where n_{i,j} is the number of users in the jth cluster of the ith partition;
for probabilistic prediction, the probabilistic prediction value for the quantile q is expressed as:
Figure BDA0002364735430000039
for deterministic prediction, training the model
Figure BDA00023647354300000310
yields the prediction result
Figure BDA00023647354300000311
for the ith cluster partition, the cumulative load prediction for probabilistic prediction is expressed as
Figure BDA00023647354300000312
The cumulative load prediction for deterministic prediction is represented as
Figure BDA00023647354300000313
Further, in the step 6, the weights of the probabilistic cumulative load prediction results of the cluster partitions are determined by solving the following optimization problem:
Figure BDA00023647354300000314
Figure BDA00023647354300000315
Figure BDA00023647354300000316
w_{i,q} ≥ 0
wherein
Figure BDA00023647354300000317
is the cumulative load prediction result of the ith cluster partition at time l; the objective function minimizes the pinball loss function on the validation set; the pinball loss function is:
Figure BDA00023647354300000318
where N_va is the number of samples in the validation set.
Further, in the step 6, the weights of the deterministic cumulative load prediction results of the cluster partitions are determined by solving the following optimization problem:
Figure BDA0002364735430000041
Figure BDA0002364735430000042
Figure BDA0002364735430000043
Figure BDA0002364735430000044
wherein
Figure BDA0002364735430000045
and the objective function minimizes the MAPE value on the validation set.
The invention has the beneficial effects that:
1. The invention provides a clustering method based on information accumulation, which covers the shape information of users' electrical load more comprehensively without relying on extracted features. In addition, the method produces a dendrogram describing the hierarchical correlation of users' electricity consumption, which helps characterize users' consumption behavior.
2. Compared with methods that rely on a single prediction, the proposed method realizes ensemble learning for cumulative load prediction and improves the accuracy of both probabilistic and deterministic prediction.
3. No comparable method has so far been implemented or operated at large scale in China or abroad, so the method is both novel and practical.
Drawings
FIG. 1 is a comparison of deterministic predictions for method P-10 of a preferred embodiment of the present invention and prior art method C-4;
FIG. 2a is a probabilistic prediction result of the method P-10 according to a preferred embodiment of the present invention;
FIG. 2b is a probabilistic prediction result of prior art method C-4;
FIG. 3 is a probabilistic prediction and deterministic prediction weight heatmap of a preferred embodiment of the present invention;
FIG. 4 is a 10-Shape clustering tree based on information accumulation according to a preferred embodiment of the present invention;
FIG. 5 is a statistical information analysis of the clustering results according to a preferred embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings for clarity and understanding of technical contents. The present invention may be embodied in many different forms of embodiments and the scope of the invention is not limited to the embodiments set forth herein.
The method improves cumulative load prediction based on k-Shape clustering and raises prediction accuracy through ensemble learning; it further provides a k-Shape clustering algorithm based on information accumulation to divide users' electricity consumption behavior hierarchically. For probabilistic forecasting, the weights of the different prediction results are determined by constructing a linear programming problem whose objective is to minimize the pinball loss function, so that the reliability of the probabilistic prediction is optimal. For deterministic forecasting, the weights are determined by constructing a linear programming problem whose objective is to minimize the Mean Absolute Percentage Error (MAPE), so that the accuracy of the deterministic prediction is optimal.
The method consists of two parts: a k-Shape clustering algorithm based on information accumulation, and probabilistic and deterministic cumulative load prediction.
(1) k-Shape clustering algorithm based on information accumulation
1) Load Shape information mining based on k-Shape clustering algorithm
This step groups all daily load curves of the users according to their shape characteristics. k-Shape clustering is performed on the training set
Figure BDA0002364735430000051
where N is the number of power users. The load data set of user i can be represented as
Figure BDA0002364735430000052
where m is the length of the load sequence and N_tr is the size of the training set.
Similar to other centroid-based clustering methods, k-Shape clustering aims at solving the following NP-hard optimization problem in a heuristic manner:
Figure BDA0002364735430000053
wherein
Figure BDA0002364735430000054
is the centroid of cluster p_j ∈ P, and
Figure BDA0002364735430000055
is the shape distance measure of the sequences of length m
Figure BDA0002364735430000056
defined by the following formula:
Figure BDA0002364735430000057
where
Figure BDA0002364735430000058
measures the cross-correlation of the sequences and can be defined as
Figure BDA0002364735430000059
in which
Figure BDA00023647354300000510
is calculated from the following formula:
Figure BDA00023647354300000511
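The shape distance formulas above survive only as image placeholders in this text. As an illustration, the following Python sketch implements the shape-based distance (SBD) in the form commonly given for k-Shape in the literature: one minus the maximum normalized cross-correlation over all 2m-1 shifts. It is an assumption, not confirmed by the text, that the patent's formula matches this standard form; the function names are hypothetical, and k-Shape implementations usually z-normalize each sequence beforehand, which is omitted here.

```python
import numpy as np

def cross_correlation(x, y):
    """Cross-correlation of two length-m sequences for all 2m-1 shifts."""
    # "full" mode slides y across x and returns one value per shift.
    return np.correlate(x, y, mode="full")

def sbd(x, y):
    """Shape-based distance: 1 - max normalized cross-correlation (assumed form)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    denom = np.sqrt(np.dot(x, x) * np.dot(y, y))
    if denom == 0.0:
        # Assumption: an all-zero sequence is treated as maximally dissimilar.
        return 1.0
    ncc = cross_correlation(x, y) / denom
    return float(1.0 - ncc.max())

# Two load shapes that differ only by a one-step shift.
x = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
y = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])
d_same = sbd(x, x)
d_shift = sbd(x, y)
```

Because the maximum is taken over all shifts, two curves with the same shape but a time offset obtain a distance close to zero, which is the property that makes the measure suitable for grouping load curves by shape.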
2) Combining the clustering information
After the clustering is completed, the number of load curves of each user in each cluster is counted and recorded as
Figure BDA00023647354300000512
where i = 1, ..., k and j = 1, ..., N. The similarity matrix
Figure BDA00023647354300000513
is then defined as follows:
Figure BDA00023647354300000514
S(p, q) comprehensively describes the degree of similarity between the load curves of users p and q; thus, by combining the clustering information, the partition of the load curves is converted into a measure of similarity between users. Next, the similarity matrix is converted into a distance matrix by D = I - S,
Figure BDA00023647354300000515
where every element of the matrix
Figure BDA00023647354300000516
is 1.
The invention then applies a single-linkage hierarchical clustering algorithm to the distance matrix. This yields a hierarchy that characterizes the distance between users and can be described by a dendrogram.
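The pipeline from cluster counts to user partitions can be sketched in Python as follows. The exact definition of S(p, q) is contained in an image and is not recoverable from this text, so the sketch substitutes a histogram-intersection similarity (the shared fraction of curves per cluster), which is purely an assumption chosen because it gives S(p, p) = 1 and values in [0, 1]. The function names are hypothetical; the hierarchical clustering uses SciPy's single-linkage method.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def count_matrix(user_ids, cluster_ids, n_users, k):
    """counts[i, j] = number of load curves of user j assigned to cluster i."""
    counts = np.zeros((k, n_users), dtype=int)
    np.add.at(counts, (np.asarray(cluster_ids), np.asarray(user_ids)), 1)
    return counts

def similarity_matrix(counts):
    """ASSUMED similarity: histogram intersection of per-user cluster shares."""
    # Every user is assumed to have at least one load curve.
    shares = counts / counts.sum(axis=0, keepdims=True)
    n_users = counts.shape[1]
    S = np.empty((n_users, n_users))
    for p in range(n_users):
        S[p] = np.minimum(shares[:, [p]], shares).sum(axis=0)
    return S

def user_partitions(S, cluster_numbers):
    """D = I - S with I the all-ones matrix, then single-linkage clustering."""
    D = 1.0 - S
    np.fill_diagonal(D, 0.0)  # remove rounding noise on the diagonal
    Z = linkage(squareform(D, checks=False), method="single")
    return {c: fcluster(Z, t=c, criterion="maxclust") for c in cluster_numbers}

# Toy data: users 0 and 1 fall into cluster 0, users 2 and 3 into cluster 1.
user_ids = [0, 0, 1, 1, 2, 2, 3, 3]
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
counts = count_matrix(user_ids, cluster_ids, n_users=4, k=2)
S = similarity_matrix(counts)
partitions = user_partitions(S, cluster_numbers=[1, 2])
```

Cutting the same linkage tree at several cluster numbers yields one user partition per number, which is what the ensemble stage consumes.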
(2) Ensemble learning based cumulative load prediction
1) Training phase
The purpose of this stage is to train learning models that provide probabilistic and deterministic predictions of the cumulative load of the N power users. Ensemble learning is realized with the dendrogram obtained in the clustering stage, which describes the hierarchical relationship of the users, in order to improve prediction accuracy. By selecting different numbers of clusters
Figure BDA00023647354300000517
different cluster partitions of the N power users are obtained, giving |N_C| different cluster partitions. For the ith partition, it is necessary to train
Figure BDA00023647354300000518
models; the model f_{i,j}, with index
Figure BDA00023647354300000519
is trained on the jth cluster, i.e.
Figure BDA00023647354300000520
where n_{i,j} is the number of users in the jth cluster of the ith partition. The probabilistic prediction for quantile q can thus be expressed as:
Figure BDA0002364735430000061
Similarly, for deterministic prediction, training the model
Figure BDA0002364735430000062
yields the prediction result
Figure BDA0002364735430000063
Thus, the probabilistic and deterministic cumulative load predictions for the ith cluster partition can be expressed as
Figure BDA0002364735430000064
and
Figure BDA0002364735430000065
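The training stage for one partition can be illustrated with a minimal Python sketch: the loads of the users in each cluster are summed, one model is fitted per cluster, and the cluster forecasts are added to give the cumulative forecast of that partition. The text does not name the learning model, so a least-squares autoregressive model is used here purely as a hypothetical stand-in for f_{i,j}; all function names and the lag structure are assumptions. A quantile forecast would follow the same pattern with a quantile regressor in place of the linear model.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Rows of X hold n_lags consecutive values; y is the value that follows."""
    cols = [series[i:len(series) - n_lags + i] for i in range(n_lags)]
    return np.column_stack(cols), series[n_lags:]

def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in for f_{i,j})."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def partition_forecast(loads, labels, n_lags, n_train):
    """Cumulative deterministic forecast for one cluster partition.

    loads  : (n_users, T) array of user loads
    labels : (n_users,) cluster label of each user in this partition
    """
    total = None
    for c in np.unique(labels):
        cluster_load = loads[labels == c].sum(axis=0)  # aggregate load of cluster
        X, y = make_lagged(cluster_load, n_lags)
        coef = fit_linear(X[:n_train], y[:n_train])
        pred = predict_linear(coef, X[n_train:])
        total = pred if total is None else total + pred
    return total

# Toy data: three users with linear load trends, split into two clusters.
t = np.arange(20, dtype=float)
loads = np.vstack([1.0 + t, 2.0 * t, 5.0 + 0.5 * t])
labels = np.array([0, 0, 1])
forecast = partition_forecast(loads, labels, n_lags=2, n_train=10)
actual = loads.sum(axis=0)[2 + 10:]
```

Repeating `partition_forecast` for every partition obtained from the dendrogram produces the set of candidate forecasts that the ensemble learning phase weights.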
2) Ensemble learning phase
The ensemble learning phase determines the weights of the cumulative load predictions of the cluster partitions and combines them into the final prediction. For both probabilistic and deterministic prediction, weight determination is formulated as an optimization problem solved on the validation set.
For probabilistic prediction, the objective is to minimize the pinball loss function:
Figure BDA0002364735430000066
where N_va is the number of samples in the validation set. The following optimization problem is therefore constructed for each quantile:
Figure BDA0002364735430000067
Figure BDA0002364735430000068
Figure BDA0002364735430000069
w_{i,q} ≥ 0
where
Figure BDA00023647354300000610
is the cumulative load prediction result of the ith cluster partition at time l.
For the probabilistic prediction of quantile q, as proven in the existing literature, introducing the auxiliary variables
Figure BDA00023647354300000611
converts the above optimization problem into the following linear optimization problem:
Figure BDA00023647354300000612
Figure BDA00023647354300000613
Figure BDA00023647354300000614
Figure BDA00023647354300000615
Figure BDA00023647354300000616
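The linear program itself is only available as images here, so the following Python sketch shows one standard way to linearize pinball-loss minimization with auxiliary variables, using `scipy.optimize.linprog`. It assumes, without confirmation from the text, that the weights are non-negative and sum to one; the function name and the variable layout are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def pinball_weights(y, forecasts, q):
    """Weights minimizing the mean pinball loss of the combined quantile forecast.

    y         : (T,) actual cumulative load on the validation set
    forecasts : (K, T) quantile-q forecasts, one row per cluster partition
    """
    y = np.asarray(y, dtype=float)
    F = np.asarray(forecasts, dtype=float)
    K, T = F.shape
    # Decision vector [w_1..w_K, u_1..u_T, v_1..v_T]; linprog's default
    # bounds keep every entry non-negative.
    c = np.concatenate([np.zeros(K), np.full(T, q / T), np.full(T, (1 - q) / T)])
    # Residual split: y_l - sum_i w_i * F[i, l] = u_l - v_l
    A_fit = np.hstack([F.T, np.eye(T), -np.eye(T)])
    # ASSUMPTION: the weights sum to one.
    A_sum = np.concatenate([np.ones(K), np.zeros(2 * T)])[None, :]
    res = linprog(c, A_eq=np.vstack([A_fit, A_sum]),
                  b_eq=np.append(y, 1.0), method="highs")
    return res.x[:K], res.fun

# Toy validation data: the second partition forecasts the load exactly.
y_val = np.array([1.0, 2.0, 3.0, 4.0])
F_val = np.vstack([y_val + 1.0, y_val, y_val + 3.0])
w_q, loss_q = pinball_weights(y_val, F_val, q=0.2)
```

Here u and v carry the positive and negative parts of the residual, so the objective q·u + (1-q)·v equals the pinball loss at the optimum.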
For deterministic prediction, the objective is to minimize the MAPE on the validation set. Similarly, by introducing the auxiliary variables
Figure BDA00023647354300000617
the following linear optimization problem can be constructed:
Figure BDA0002364735430000071
Figure BDA0002364735430000072
Figure BDA0002364735430000073
Figure BDA0002364735430000074
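A corresponding sketch for the deterministic case is given below. It linearizes the absolute error with one auxiliary variable per validation sample, a common technique that is assumed rather than confirmed to match the formulation in the images. The sum-to-one constraint on the weights, the function name, and the requirement that the actual load is non-zero are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def mape_weights(y, forecasts):
    """Weights minimizing the MAPE of the combined deterministic forecast.

    y         : (T,) actual cumulative load on the validation set (non-zero)
    forecasts : (K, T) deterministic forecasts, one row per cluster partition
    """
    y = np.asarray(y, dtype=float)
    F = np.asarray(forecasts, dtype=float)
    K, T = F.shape
    # Decision vector [w_1..w_K, e_1..e_T] with e_l >= |y_l - sum_i w_i F[i, l]|.
    c = np.concatenate([np.zeros(K), 1.0 / (T * np.abs(y))])
    A_ub = np.vstack([
        np.hstack([F.T, -np.eye(T)]),    #  combined - y <= e
        np.hstack([-F.T, -np.eye(T)]),   #  y - combined <= e
    ])
    b_ub = np.concatenate([y, -y])
    # ASSUMPTION: the weights sum to one.
    A_eq = np.concatenate([np.ones(K), np.zeros(T)])[None, :]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], method="highs")
    return res.x[:K], res.fun

# Toy validation data: the second partition forecasts the load exactly.
y_val = np.array([1.0, 2.0, 3.0, 4.0])
F_val = np.vstack([y_val + 1.0, y_val, y_val + 3.0])
w_det, mape_val = mape_weights(y_val, F_val)
```

The returned objective is the MAPE as a fraction; multiplying by 100 gives a percentage.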
3) Testing phase
The pinball loss function and the MAPE on the test set are selected as the evaluation indices for probabilistic and deterministic prediction, respectively. The smaller the value, the better the prediction performance.
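The two evaluation indices can be computed as in the following Python sketch, which uses their standard textbook definitions; the function names are hypothetical, and reporting MAPE as a percentage is an assumption about the convention used in the tables.

```python
import numpy as np

def pinball_loss(y, y_hat, q):
    """Mean pinball loss of a quantile-q forecast."""
    diff = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    # q * diff when the forecast is too low, (q - 1) * diff when it is too high.
    return float(np.mean(np.maximum(q * diff, (q - 1) * diff)))

def mape(y, y_hat):
    """Mean absolute percentage error in percent; y must be non-zero."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs((y - y_hat) / y)) * 100.0)

pl = pinball_loss([1.0, 2.0], [0.0, 3.0], q=0.2)
err = mape([100.0, 200.0], [110.0, 180.0])
```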
Example:
1. Description of data
The smart meter measurements come from the smart meter dataset provided by Low Carbon London (LCL). The invention randomly selects the measurements of 36 users, recorded every half hour from 1/2013 to 31/12/2013. The numbers of clusters are selected according to N_C = [1, 2, 4, 8, 16, 32, 36]. According to the user statistics provided by LCL, the 36 users can be grouped by income and by electricity price policy. By income, there are three grades: affluent (Acorn-A), moderate (Acorn-H), and poor (Acorn-L). By electricity price policy, there are two categories: time-of-use (ToU) and standard (Std) tariffs.
2. Predicted results
To demonstrate the effectiveness of the proposed method, it is compared with a cumulative prediction method based on conventional k-Shape clustering, in which only the RLP curves of the load are used to divide the users into k clusters and the typical procedure of clustering-based methods is then used to form the final cumulative load forecast. Since determining the number of clusters is a persistent problem for that method, different numbers of clusters are tried and the result of the best-performing number is taken as its final prediction. In the following discussion, P-k denotes the proposed method and C-k denotes the comparison method.
Table 1 lists the deterministic prediction results on the test set. The fully decentralized method shows the worst prediction performance because of the large uncertainty of individual user loads. Among the comparison methods with different numbers of clusters, C-4 performs best, even better than the fully aggregated method. However, the proposed method P-10 performs best of all the methods compared. FIG. 1 shows the predicted 168-hour load curves of P-10 and C-4: although the peak load is highly uncertain and volatile, P-10 learns it better than C-4.
The probabilistic prediction results are shown in Table 2. The pinball loss function measures the reliability of the probabilistic predictions. The invention predicts the 20%, 40%, 60%, and 80% quantiles, denoted Q20, Q40, Q60, and Q80. Values in bold are the best result for each quantile. C-4 and P-10 were chosen for probabilistic prediction because they perform best in deterministic prediction. The results show that the clustering-based methods greatly reduce the pinball loss compared with the fully aggregated method, and that the proposed method has the lowest pinball loss at almost all quantiles. FIGS. 2a and 2b show the probabilistic predictions of the 168-hour load curves for P-10 and C-4. The prediction intervals formed by the predicted loads at the different quantiles cover the actual load well, and the prediction intervals of P-10 are narrower than those of C-4. To quantify the interval width, the average widths of the 20% and 60% prediction intervals are listed in Table 3. The smaller the value, the higher the sharpness. These results show that the proposed method has better reliability and sharpness.
TABLE 1 deterministic load prediction results
Figure BDA0002364735430000081
TABLE 2 probabilistic load prediction results
Figure BDA0002364735430000082
TABLE 3 Mean probabilistic prediction interval
Method    20% probability prediction interval (kW)    60% probability prediction interval (kW)
P-10      0.628                                       2.449
C-4       1.068                                       3.607
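The interval widths in Table 3 can be illustrated with a short Python sketch. The text does not state how the 20% and 60% intervals are built from the four predicted quantiles; the sketch assumes central intervals, that is, Q40 to Q60 for the 20% interval and Q20 to Q80 for the 60% interval. The function name and the toy numbers are hypothetical and are not the patent's data.

```python
import numpy as np

def mean_interval_width(lower_q, upper_q):
    """Average width of the interval between two quantile forecasts."""
    lower_q = np.asarray(lower_q, dtype=float)
    upper_q = np.asarray(upper_q, dtype=float)
    return float(np.mean(upper_q - lower_q))

# Toy quantile forecasts for two time steps (kW).
q20 = np.array([1.0, 1.0])
q40 = np.array([2.0, 2.0])
q60 = np.array([3.0, 3.5])
q80 = np.array([5.0, 5.0])

width_20 = mean_interval_width(q40, q60)  # assumed 20% central interval
width_60 = mean_interval_width(q20, q80)  # assumed 60% central interval
```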
To visualize the optimized weights, they are converted into heat maps, as shown in FIG. 3. The weights in w = [w_1, w_2, ..., w_7] are those assigned to the prediction results formed with the numbers of clusters in N_C = [1, 2, 4, 8, 16, 32, 36]. The results show that, for both probabilistic and deterministic prediction, the weight w_1 is typically large, whereas w_7 is assigned 0 in all predictions. The weight of a better-performing prediction result is therefore generally larger.
3. Results of the information accumulation k-Shape clustering algorithm
Since P-10 has the best prediction performance, the information-accumulation-based 10-Shape clustering results are analyzed here. The dendrogram is shown in FIG. 4. It indicates that the energy consumption of the 12th electricity consumer differs greatly from that of the other consumers. The tree is therefore cut at the position where all power consumers are divided into 2 groups, and the clustering results are analyzed using the statistical data, as shown in FIG. 5. The results show that the 12th power consumer belongs to the poor income group and is on the standard electricity price policy.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (8)

1. An accumulated load prediction method based on an information accumulation k-Shape clustering algorithm is characterized by comprising the following steps:
step 1, executing k-Shape clustering according to the Shape characteristics of the power load curve of a user;
the step 1 specifically comprises the following steps:
step 1.1, representing the electrical load curve as a training set
Figure FDA0003740917380000011
The load data set for user i is represented as
Figure FDA0003740917380000012
Wherein N is the number of power consumers, m is the length of the load sequence, N tr Is the size of the training set;
step 1.2, solving the NP-hard optimization problem in a heuristic mode to realize k-Shape clustering;
step 2, converting the division of the load curve into a similarity matrix of the load curve among the users through the information obtained by the combined clustering;
the step 2 specifically comprises the following steps:
step 2.1, after the clustering is finished, calculating each user u j In each cluster p i And is noted as the number of loads contained in
Figure FDA0003740917380000013
Wherein i 1.. k, j 1.. N;
step 2.2, defining the similarity matrix
Figure FDA0003740917380000014
Is composed of
Figure FDA0003740917380000015
Step 3, converting the similarity matrix into a distance matrix;
step 4, applying a hierarchical clustering algorithm based on single relation on the distance matrix to obtain a hierarchical structure describing the distance between each user;
step 5, selecting different clustering numbers according to the hierarchical structure obtained in the step 4 to obtain different cluster partitions of the user, training a learning model, and performing probabilistic prediction and deterministic prediction on the cumulative load of the user;
step 6, determining the weights of the cumulative load prediction results of the probabilistic prediction and the deterministic prediction of each cluster partition, and combining the cumulative load prediction results of the probabilistic prediction and the deterministic prediction of each cluster partition into a final cumulative load prediction result.
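Steps 2.1 and 2.2 of claim 1 can be illustrated with a minimal Python sketch. The claimed similarity formula survives only as a formula image, so the sketch substitutes cosine similarity between per-user cluster-count vectors as a stand-in. That choice, the function names `cluster_counts` and `similarity_matrix`, and the toy labels are all assumptions made for illustration.

```python
import math

def cluster_counts(labels_per_user, k):
    """For each user, count how many of its load curves fall in each of k clusters."""
    counts = []
    for labels in labels_per_user:
        row = [0] * k
        for cluster_id in labels:
            row[cluster_id] += 1
        counts.append(row)
    return counts

def similarity_matrix(counts):
    """ASSUMPTION: cosine similarity of count vectors stands in for the claimed formula."""
    n = len(counts)
    norms = [math.sqrt(sum(v * v for v in row)) for row in counts]
    sim = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            dot = sum(x * y for x, y in zip(counts[a], counts[b]))
            if norms[a] > 0 and norms[b] > 0:
                sim[a][b] = dot / (norms[a] * norms[b])
    return sim

# Hypothetical k-Shape labels: three users, three daily curves each, k = 3 clusters.
labels_per_user = [[0, 0, 1], [0, 0, 1], [2, 2, 2]]
counts = cluster_counts(labels_per_user, k=3)
S = similarity_matrix(counts)
```

Users whose curves land in the same clusters with the same frequency get a similarity close to 1, and users that never share a cluster get 0.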
2. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 1, wherein the specific formula of the NP-hard optimization problem of step 1.2 is as follows:
[Figure FDA0003740917380000016]
wherein [Figure FDA0003740917380000017] is the centroid of the cluster p_j ∈ P, and [Figure FDA0003740917380000018] is the shape distance measure between the sequences [Figure FDA0003740917380000019] of length m.
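For orientation, the k-Shape literature commonly states the partitional clustering objective in the form below. This is a hedged reconstruction and not a transcription of the claim's formula image: the symbols $P^{*}$, $x_i$, $c_j$ and $\mathrm{SBD}$ are assumed notation, and squaring the distance follows the usual sum-of-squared-distances convention.

```latex
P^{*} = \underset{P}{\arg\min} \sum_{j=1}^{k} \sum_{x_i \in p_j} \mathrm{SBD}(x_i, c_j)^{2}
```

Here $P = \{p_1, \dots, p_k\}$ is a partition of the load curves into $k$ clusters and $c_j$ is the centroid of cluster $p_j$.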
3. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 2, wherein the specific formula of [Figure FDA00037409173800000110] is as follows:
[Figure FDA00037409173800000111]
wherein [Figure FDA00037409173800000112] is the cross-correlation measure of the sequences, w = 1, ..., 2m-1;
[Figure FDA00037409173800000113]
[Figure FDA0003740917380000021]
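The shape distance of claim 3 can be sketched in Python using the shape-based distance as it is usually defined for k-Shape: one minus the maximum normalized cross-correlation over all 2m-1 shifts. The function names and sample sequences are hypothetical, and the sketch assumes both sequences have the same length m. It uses a direct O(m^2) sum for clarity, not the FFT-based computation that practical implementations favor.

```python
import math

def cross_correlation(x, y, shift):
    """R_shift(x, y): sum of x[l + shift] * y[l]; a negative shift swaps the roles of x and y."""
    if shift < 0:
        return cross_correlation(y, x, -shift)
    return sum(x[l + shift] * y[l] for l in range(len(x) - shift))

def shape_based_distance(x, y):
    """1 - max_w CC_w(x, y) / sqrt(R_0(x, x) * R_0(y, y)), with w = 1, ..., 2m-1."""
    m = len(x)
    denom = math.sqrt(cross_correlation(x, x, 0) * cross_correlation(y, y, 0))
    if denom == 0:
        return 1.0  # convention chosen here for all-zero sequences
    # Index w = 1, ..., 2m-1 corresponds to shifts -(m-1), ..., m-1.
    best = max(cross_correlation(x, y, w - m) for w in range(1, 2 * m))
    return 1.0 - best / denom

curve = [1.0, 2.0, 3.0, 0.0]
shifted_curve = [0.0, 1.0, 2.0, 3.0]
distance = shape_based_distance(curve, shifted_curve)
```

Because the measure searches over shifts, a load curve and a time-shifted copy of it are treated as having the same shape.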
4. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 1, wherein the step 3 specifically comprises the following steps:
converting the similarity matrix into the distance matrix [Figure FDA0003740917380000022] by D = 1 - S, wherein every element of the matrix [Figure FDA0003740917380000023] is 1.
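The conversion in claim 4 amounts to subtracting each similarity from one. A minimal sketch follows, in which the function name `similarity_to_distance` and the similarity values are hypothetical.

```python
def similarity_to_distance(similarity):
    """Return D = 1 - S element-wise, where 1 denotes the all-ones matrix."""
    return [[1.0 - s for s in row] for row in similarity]

# Hypothetical similarity matrix for three users (symmetric, ones on the diagonal).
S = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
D = similarity_to_distance(S)
```

Subtracting from the all-ones matrix keeps every entry non-negative when the similarities lie in [0, 1], and gives each user a distance of zero to itself.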
5. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 1, wherein the hierarchical structure in the step 4 is described using a clustering tree.
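The clustering tree of step 4 can be illustrated with a small pure-Python single-linkage routine that merges the two closest groups until a requested number of clusters remains. Calling it with several cluster numbers corresponds to cutting the tree at different heights. The function name and the distance values are hypothetical, and a production system would more likely rely on a library routine.

```python
def single_linkage_partition(dist, n_clusters):
    """Merge clusters by single linkage until n_clusters groups remain."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is that of the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return clusters

# Hypothetical distance matrix for four users.
D = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
partitions = {n: single_linkage_partition(D, n) for n in (1, 2, 3)}
```

Because agglomerative merging is nested, the partitions obtained for different cluster numbers are all consistent with one tree.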
6. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 5, wherein the step 5 specifically comprises the following steps:
by selecting different numbers of clusters [Figure FDA0003740917380000024], |N_C| different cluster partitions are obtained for the N users;
for the ith cluster partition, [Figure FDA0003740917380000025] models need to be trained, the [Figure FDA0003740917380000026]-th model f_{i,j} being trained on the jth cluster, i.e.
[Figure FDA0003740917380000027]
wherein n_{i,j} is the number of users in the jth cluster of the ith partition;
for probabilistic prediction, the probabilistic prediction value for the quantile q is expressed as:
[Figure FDA0003740917380000028]
for deterministic prediction, the model [Figure FDA0003740917380000029] is trained to obtain the prediction result of [Figure FDA00037409173800000210];
for the ith cluster partition, the cumulative load prediction for probabilistic prediction is expressed as
[Figure FDA00037409173800000211]
and the cumulative load prediction for deterministic prediction is expressed as
[Figure FDA00037409173800000212]
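The deterministic branch of claim 6 can be sketched as follows, assuming each cluster model is trained on the summed load of the users in that cluster and the cluster forecasts are added to form the cumulative forecast of the partition. The persistence model is only a stand-in for the learning model, and all names and load values are hypothetical.

```python
def aggregate_cluster_load(user_loads, members):
    """Sum the load series of the users in one cluster, time step by time step."""
    return [sum(values) for values in zip(*(user_loads[u] for u in members))]

def persistence_forecast(series):
    """ASSUMPTION: stand-in model that predicts the last observed value."""
    return series[-1]

def partition_forecast(user_loads, partition, model=persistence_forecast):
    """Forecast each cluster's aggregate load, then add the cluster forecasts."""
    return sum(model(aggregate_cluster_load(user_loads, members)) for members in partition)

# Hypothetical loads for four users over three time steps.
user_loads = [
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 2.0],
    [0.5, 1.5, 2.5],
    [4.0, 3.0, 1.0],
]
# Two cluster partitions: one cluster of all users, and two clusters.
partitions = [[[0, 1, 2, 3]], [[0, 2], [1, 3]]]
forecasts = [partition_forecast(user_loads, p) for p in partitions]
```

With a linear stand-in such as persistence, every partition yields the same cumulative forecast. With nonlinear learners the partitions generally disagree, which is what makes weighting and combining them in step 6 worthwhile.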
7. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 6, wherein the weights of the cumulative load prediction results of the probabilistic prediction of each cluster partition in the step 6 are determined by solving the following optimization problem:
[Figure FDA00037409173800000213]
[Figure FDA00037409173800000214]
[Figure FDA00037409173800000215]
w_{i,q} ≥ 0
wherein [Figure FDA00037409173800000216] is the cumulative load prediction result of the ith cluster partition at time l; the objective function is to minimize the pinball loss function on the validation set; the pinball loss function is:
[Figure FDA00037409173800000217]
wherein N_va is the number of samples in the validation set.
8. The accumulated load prediction method based on the information accumulation k-Shape clustering algorithm of claim 6, wherein the weights of the cumulative load prediction results of the deterministic prediction of each cluster partition in the step 6 are determined by solving the following optimization problem:
[Figure FDA0003740917380000031]
[Figure FDA0003740917380000032]
[Figure FDA0003740917380000033]
[Figure FDA0003740917380000034]
wherein
[Figure FDA0003740917380000035]
and the objective function is to minimize the MAPE (mean absolute percentage error) value on the validation set.
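The weight selection of claim 8 can be sketched for the simplest case of two cluster partitions. The claimed optimization problem survives only as formula images, so the sketch assumes weights that are non-negative and sum to one, and it replaces a proper solver with a coarse grid search. The function names and data are hypothetical.

```python
def mape(actuals, predictions):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actuals, predictions)) / len(actuals)

def best_weights_two_partitions(actuals, forecast_a, forecast_b, steps=100):
    """ASSUMPTION: grid search over w in [0, 1], with weight w on A and 1 - w on B."""
    best = None
    for s in range(steps + 1):
        w = s / steps
        combined = [w * a + (1.0 - w) * b for a, b in zip(forecast_a, forecast_b)]
        score = mape(actuals, combined)
        if best is None or score < best[0]:
            best = (score, w)
    return best[1], best[0]

# Hypothetical validation data: partition A over-forecasts by 10 %, B under-forecasts by 10 %.
actuals = [100.0, 200.0]
forecast_a = [110.0, 220.0]
forecast_b = [90.0, 180.0]
weight_a, best_mape = best_weights_two_partitions(actuals, forecast_a, forecast_b)
```

In this toy case the two errors cancel at equal weights, so the search settles on a weight of 0.5 for each partition.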

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010032213.8A CN111291782B (en) 2020-01-13 2020-01-13 Accumulated load prediction method based on information accumulation k-Shape clustering algorithm

Publications (2)

Publication Number Publication Date
CN111291782A (en) 2020-06-16
CN111291782B (en) 2022-09-09

Family

ID=71022330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010032213.8A Active CN111291782B (en) 2020-01-13 2020-01-13 Accumulated load prediction method based on information accumulation k-Shape clustering algorithm

Country Status (1)

Country Link
CN (1) CN111291782B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112598157B (en) * 2020-11-27 2023-03-24 Dongguan Power Supply Bureau of Guangdong Power Grid Co., Ltd. Power load prediction method and device
CN112653126A (en) * 2020-11-30 2021-04-13 Central South University Power grid broadband oscillation online identification method and system
CN113361776A (en) * 2021-06-08 2021-09-07 State Grid Shanghai Municipal Electric Power Company Power load probability prediction method based on user power consumption behavior clustering

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108574290A (en) * 2018-04-12 2018-09-25 State Grid Corporation of China Method, device, terminal and readable storage medium for locating the oscillation source of a forced oscillation
CN108596362A (en) * 2018-03-22 2018-09-28 Economic and Technological Research Institute of State Grid Sichuan Electric Power Company Electric load curve shape clustering method based on adaptive piecewise aggregate approximation
CN110390440A (en) * 2019-07-29 2019-10-29 Northeastern University Smart meter user aggregate load prediction method based on clustering and deep neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
An Ensemble Forecasting Method for the Aggregated Load With Subprofiles; Yi Wang et al.; IEEE Transactions on Smart Grid; 2018-02-21; pp. 3906-3908 *
Combining Probabilistic Load Forecasts; Yi Wang et al.; IEEE Transactions on Smart Grid; 2018-05-08; pp. 3664-3674 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant