CN111582567A - Wind power probability prediction method based on hierarchical integration

Info

Publication number: CN111582567A
Application number: CN202010348291.9A
Authority: CN
Inventors: 金怀平; 石立贤; 金怀康
Original assignee: Kunming University of Science and Technology
Current assignee: Kunming University of Science and Technology
Priority date: 2020-04-28
Filing date: 2020-04-28
Publication date: 2020-08-25
Anticipated expiration: 2040-04-28
Also published as: CN111582567B

Abstract

The invention discloses a wind power probability prediction method based on hierarchical integration. A set of subspaces is constructed through resampling and partial least squares; on each subspace, several local regions are obtained by GMM clustering and a corresponding local GPR model is built for each region; the local models are then fused by a Bayesian inference strategy and a finite mixture mechanism to form a first-layer integrated model. A genetic algorithm selects suitable first-layer integrated models, which are combined by selective adaptive integration to obtain a selective hierarchical ensemble Gaussian process regression probability prediction model. To counter the performance deterioration caused by changes in the characteristics of wind power data, an adaptive updating strategy is introduced so that the prediction model can update itself. The method applies a selective hierarchical ensemble learning framework to ultra-short-term wind power prediction; compared with traditional ensemble learning prediction methods, it achieves higher prediction accuracy and stability, and the prediction intervals it generates provide an effective reference for power scheduling.

Description

Wind power probability prediction method based on hierarchical integration

Technical Field

The invention relates to the technical field of wind power prediction, in particular to a wind power probability prediction method based on hierarchical integration.

Background

Wind energy is a pollution-free and widely distributed renewable energy source, and wind power generation technology has developed rapidly in recent years. However, because wind energy is random and fluctuating, connecting unstable wind power to the grid affects the safety and stability of the power system and therefore the stable operation of grid equipment. Accurate and efficient wind power prediction supports reasonable power scheduling, provides a reliable reference for generation planning and shutdown maintenance, and helps keep the system safe, reliable and economical to operate. Wind power prediction therefore plays a crucial role in moving the power generation industry toward clean, environmentally friendly operation, and has great engineering application value.

Ensemble learning is a strategy that completes a learning task by constructing and combining several sub-models. Because it can obtain better performance than a single model, it is widely applied in the field of wind power prediction. It is well known that an ensemble performs better when its sub-models are both accurate and diverse. However, most ensemble learning research on wind power prediction neglects building sub-model diversity from the input data, which makes it difficult to obtain richly diverse sub-models. In addition, because the model is built from historical data, concept drift inevitably occurs as the prediction time grows, so the model should have some adaptive capacity. The adaptation of an ensemble model has two parts: each sub-model should be able to update itself adaptively, and the weights of the integrated sub-models should change adaptively rather than stay fixed. The adaptation of ensemble models, however, has only been discussed in recent studies.

Finally, because wind energy is strongly random and highly uncertain, traditional single-point prediction cannot estimate the uncertainty of wind well. For the stability of the power system, grid connection of wind power requires an accurate estimate of the fluctuation range of the wind power, for which single-point prediction is far from sufficient. Therefore, a probabilistic modeling method capable of generating a probabilistic prediction interval should be used for the sub-models.

Disclosure of Invention

Aiming at the defects of the prior art, the invention provides a wind power probability prediction method based on hierarchical integration, which effectively improves the accuracy and stability of a prediction model.

The invention adopts the following technical scheme for solving the technical problems: a wind power probability prediction method based on hierarchical integration comprises the following steps:

step (1): selecting historical meteorological data D of a wind farm over a period of time as the modeling sample set, and dividing the sample set into a training set D_train, a validation set D_val and a test set D_test; resampling D_train multiple times with the Bootstrapping method to obtain L sub-sample sets {(X_1, y_1), ..., (X_L, y_L)}; selecting the input feature variables of each sub-sample set with the partial least squares (PLS) method by ranking their importance, deleting identical feature subsets, and constructing N subspaces {S_1, ..., S_N}; and saving the input feature variable indexes of the training set samples corresponding to the N subspaces;

step (2): mapping the subspace indexes to the training set D_train to obtain N subspace training data sets {D_tra,1, ..., D_tra,N}; clustering each subspace with a Gaussian mixture model GMM, assuming that z local regions {LD_1, LD_2, ..., LD_z} are obtained on the i-th subspace training data set D_tra,i; modeling each local region with Gaussian process regression to obtain a GPR model set {GPR_1, GPR_2, ..., GPR_z}; for a new sample x*, obtaining the prediction output of the first-layer integrated EGPR model on the i-th subspace by a Bayesian inference strategy and a finite mixture mechanism; similarly, obtaining on the N subspaces the prediction outputs of N first-layer integrated EGPR models {EGPR_1, EGPR_2, ..., EGPR_N};

step (3): according to step (2), calculating on the validation set D_val the prediction accuracy RMSE and the standard deviation STD of the N first-layer integrated EGPR models, taking the weighted combination of RMSE and STD as the optimization objective of model selection, and selecting with a genetic algorithm the first-layer integrated EGPR models that perform well and stably, assuming that M first-layer integrated EGPR models are selected as the sub-models of the second-layer integration;

step (4): integrating the sub-models of the second-layer integration in an adaptive integration manner to obtain the final SHEGPR model;

step (5): updating the local regions LD, the GPR models and the GMM models as the prediction time increases.

Further, in step (1), the historical meteorological data D comprises the meteorological data and operating data of the wind farm over the past 2 to 4 months, D = {x, y}, with $x \in \mathbb{R}^{p \times q}$ and $y \in \mathbb{R}^{p}$,

wherein p is the number of samples, q = f × l, f is the number of input features, l is the number of delay variables, and y is the predicted power; the input features comprise historical wind speed W_S, historical power P and historical wind direction W_D.
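The construction of the q = f × l input variables can be illustrated with a minimal sketch. It assumes that each of the f features contributes its l most recent values and that the target is the power a fixed number of steps ahead; the function name `build_lagged_samples`, the `horizon` parameter and the dictionary keys are hypothetical and do not come from the source.

```python
def build_lagged_samples(series, lags, horizon=1, target="power"):
    """Build (X, y) from equally long feature series.

    series : dict mapping a feature name to a list of floats
    lags   : number of delay variables l per feature
    horizon: how many steps ahead the target lies (assumed, not from the source)
    """
    names = sorted(series)                  # fixed feature order
    n = len(series[names[0]])
    X, y = [], []
    # t is the index of the most recent observation used as input
    for t in range(lags - 1, n - horizon):
        row = []
        for name in names:
            # l most recent values of this feature, oldest first
            row.extend(series[name][t - lags + 1 : t + 1])
        X.append(row)                       # row length is f * l
        y.append(series[target][t + horizon])
    return X, y


demo = {
    "power": [float(i) for i in range(12)],
    "wind_speed": [float(10 + i) for i in range(12)],
    "wind_dir": [float(100 + i) for i in range(12)],
}
X_demo, y_demo = build_lagged_samples(demo, lags=8, horizon=1)
```

With f = 3 features and l = 8 delay variables, every row of `X_demo` has q = 24 entries.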

Further, the specific process of feature selection on the sub-sample sets by partial least squares (PLS) in step (1) is as follows:

① training PLS on each of the L sub-sample sets to obtain the regression coefficients β_r on the r-th sub-sample set, where r ∈ {1, ..., L}; β_r represents the importance of each feature in the input X to y on that sub-sample set;

② sorting the entries of β_r from large to small to obtain {b_1, b_2, b_3, ..., b_q}, and judging according to formula (1):

in formula (1), b_i is the i-th entry of the sorted β_r, and th is set to 0.8-0.9; if formula (1) holds, the indexes corresponding to the first i features are stored;

③ repeating step ② until L subspaces have been selected from the L sub-sample sets, and deleting duplicate subspaces to obtain the final N subspaces.
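The selection and de-duplication steps can be sketched as follows. Formula (1) is not reproduced in this text, so the sketch assumes it is a cumulative-importance test: keep the smallest number of top-ranked features whose share of the total importance reaches th. Ranking by absolute coefficient value is also an assumption, and the names `select_features` and `unique_subspaces` are hypothetical.

```python
def select_features(beta, th=0.85):
    """Return sorted indexes of the top features whose cumulative
    importance share first reaches th (assumed form of formula (1))."""
    importance = [abs(b) for b in beta]     # assumption: rank by magnitude
    total = sum(importance)
    if total == 0:
        raise ValueError("all regression coefficients are zero")
    # indexes ordered from most to least important
    order = sorted(range(len(beta)), key=lambda j: importance[j], reverse=True)
    chosen, running = [], 0.0
    for j in order:
        chosen.append(j)
        running += importance[j]
        if running / total >= th:
            break
    return sorted(chosen)


def unique_subspaces(index_lists):
    """Drop repeated subspaces while keeping first-seen order."""
    seen, unique = set(), []
    for idx in index_lists:
        key = tuple(idx)
        if key not in seen:
            seen.add(key)
            unique.append(list(idx))
    return unique


subspaces = unique_subspaces([
    select_features([4.0, 1.0, 2.0, 1.0], th=0.75),
    select_features([4.0, 1.0, 2.0, 1.0], th=0.75),  # duplicate, removed
    select_features([1.0, 1.0, 4.0, 2.0], th=0.75),
])
```

Because the L bootstrap sub-sample sets can yield identical feature subsets, the number of subspaces N that remains after de-duplication is at most L.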

Further, the process of clustering each subspace with the Gaussian mixture model GMM and building the first-layer integrated EGPR model in step (2) is as follows:

on the training set D_train, consider the n-th subspace training data set D_tra,n, where p is the number of samples and c is the number of features in the subspace; the maximum number of clusters is set to v, a GMM model is built on the n-th subspace, and the data of the n-th subspace are clustered into z classes, z ≤ v, that is, z local regions {LD_1, LD_2, ..., LD_z}; local models are then built on the z local regions by Gaussian process regression to obtain z GPR models, denoted {GPR_1, GPR_2, ..., GPR_z};

in detail, for a new sample x*, the GPR model of the i-th local region can be described as

$$\hat{y}_i = k_{i,*}^{T} C^{-1} y_i, \qquad \sigma_i^2 = C(x^*, x^*) - k_{i,*}^{T} C^{-1} k_{i,*} \tag{3}$$

in formula (3), k_i,* = [C(x*, x_i,1), ..., C(x*, x_i,p)]^T, C is the p × p positive definite covariance matrix of the training inputs of the i-th local region, y_i is the vector of the corresponding training outputs, and ŷ_i and σ_i² are respectively the predicted mean and variance of the sub-model GPR_i;

in the actual prediction process, for a new sample x*, assuming that the number of local regions after GMM clustering on the n-th subspace is z, the posterior probability that x* belongs to each local region is obtained by the Bayesian inference strategy:

$$P(LD_i \mid x^*) = \frac{P(x^* \mid LD_i)\,P(LD_i)}{\sum_{j=1}^{z} P(x^* \mid LD_j)\,P(LD_j)} \tag{4}$$

in formula (4), i ∈ {1, 2, 3, ..., z}, LD_i represents the i-th local region, P(x*|LD_i) is the conditional probability, and P(LD_i) is the prior probability; the predicted output on the n-th subspace is then obtained by the finite mixture mechanism as:

$$\hat{y}_n = \sum_{i=1}^{z} P(LD_i \mid x^*)\,\hat{y}_i \tag{5}$$

in formula (5), ŷ_i is the predicted value of the GPR model of the i-th local region, and P(LD_i|x*) is the posterior probability;

similarly, the mixed variance can be calculated as:

$$\sigma_n^2 = \sum_{i=1}^{z} P(LD_i \mid x^*)\left[\sigma_i^2 + \left(\hat{y}_i - \hat{y}_n\right)^2\right] \tag{6}$$

in formula (6), σ_i² is the predicted variance of the local model GPR_i;

then, for a new sample x*, the prediction output and prediction variance of the first-layer EGPR model on the n-th subspace are ŷ_n and σ_n² as given by formulas (5) and (6).

Further, the detailed process of step (3) is as follows:

① mapping the subspace indexes to the validation set D_val to obtain the validation data sets {D_val,1, ..., D_val,N} on the N subspaces, and obtaining, according to step (2), the prediction outputs of the N EGPR models on the validation set D_val;

② setting the initial population size and the number of iterations of the genetic algorithm, and taking the weighted sum of the prediction accuracy and the mixed standard deviation of the EGPR models as the objective function:

f_obj＝λRMSE+(1-λ)σ (8)

in formula (8), λ is a parameter between 0 and 1, σ is the predicted mixed standard deviation, and RMSE is the root mean square error during the optimization process;

in further detail: suppose that, at some point of the optimization, the prediction outputs of the m currently selected EGPR models are integrated by simple averaging to obtain the integrated prediction result, calculated as follows:

$$\hat{y}_{ens} = \frac{1}{m}\sum_{i=1}^{m} \hat{y}_{EGPR,i} \tag{9}$$

in formula (9), m is the number of EGPR models currently selected; the RMSE with respect to the true values is then:

$$RMSE = \sqrt{\frac{1}{N_{val}}\sum_{j=1}^{N_{val}}\left(y_j - \hat{y}_{ens,j}\right)^2} \tag{10}$$

in formula (10), N_val is the number of samples in the validation set D_val;

min{f_obj} is sought through multiple iterations, the models that perform well on the validation set D_val are selected, and their indexes are stored.

Further, the detailed process of step (4) is as follows:

assuming that M EGPR models are selected in step (3), when a new test sample x* is predicted, the second-layer integrated prediction output is the weighted combination

$$\hat{y}^{*} = \sum_{i=1}^{M} w_i\,\hat{y}_{EGPR,i}$$

and the prediction variance is computed with the same weights w_i; here ŷ_EGPR,i is the output of the i-th selected EGPR model and w_i is its integration weight, given by the posterior probability of the model:

$$w_i = P(EGPR_i \mid x^*) = \frac{P(x^* \mid EGPR_i)\,P(EGPR_i)}{\sum_{j=1}^{M} P(x^* \mid EGPR_j)\,P(EGPR_j)}$$

wherein P(x*|EGPR_i) is the conditional probability and P(EGPR_i) is the prior probability; without prior knowledge, the prior probabilities of all models are assumed to be equal, with value 1/M; the conditional probability P(x*|EGPR_i) is expressed in terms of γ, a parameter for controlling the weights.

Further, the detailed process of step (5) is as follows:

when a new sample (x_t+1, y_t+1) arrives, the posterior probabilities that the new sample belongs to the different local regions are first estimated, and the EGPR model is then updated in the local region with the maximum posterior probability; assuming that the new sample point x_t+1 has the maximum posterior probability P(LD_k|x_t+1) in LD_k, the update operation comprises two steps:

① updating the covariance matrix Σ_GPR of the GPR model of the k-th local region by a moving window;

② updating incrementally the mean vector μ_k, the covariance matrix Σ_k and the mixing coefficient π_k of the GMM in the k-th local region, the mixing coefficient being updated as:

π_k ^(t+1)＝π_k ^(t)+α(P(k|x_t+1)-π_k ^(t)) (17)

wherein α is a step size determined by T, and T is the number of samples used in the mixing-coefficient update.
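The mixing-coefficient update of formula (17) can be sketched in a few lines. The expression for α is not reproduced in this text, so the sketch simply takes `alpha` as an input; the function names are hypothetical. Applying the update to every component is a choice made here for illustration, because it keeps the coefficients summing to one whenever the posteriors do.

```python
def most_probable_region(posterior):
    """Index k of the local region with the largest posterior P(LD_k | x_{t+1})."""
    return max(range(len(posterior)), key=lambda k: posterior[k])


def update_mixing_coefficients(pi, posterior, alpha):
    """Formula (17) applied to each component:
    pi_k <- pi_k + alpha * (P(k | x_{t+1}) - pi_k)."""
    return [p + alpha * (r - p) for p, r in zip(pi, posterior)]


new_pi = update_mixing_coefficients([0.5, 0.5], [1.0, 0.0], alpha=0.25)
k_star = most_probable_region([0.2, 0.7, 0.1])
```

A larger `alpha` makes the mixing coefficients follow recent samples more quickly, at the cost of noisier estimates.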

The invention has the following characteristics. First, data diversity is generated from two perturbation angles, sample information and feature information; diverse subspaces are built through feature selection, and each subspace is clustered with GMM before modeling, which speeds up training and markedly improves the performance of the mixed model on the subspace. Second, the sub-models obtained after the first-layer integration are pruned by optimization, which improves the performance of the second-layer integrated model and reduces the computational complexity of adaptive updating. Finally, the second-layer sub-models are fused by weighting in an adaptive integration manner, so that the final SHEGPR model has adaptive capacity. Because GPR is used as the modeling sub-model, the integrated SHEGPR model not only has better prediction performance but also provides a prediction interval.

Compared with the prior art, the invention has the following beneficial effects: a selective hierarchical ensemble learning framework is applied to ultra-short-term wind power prediction; compared with traditional ensemble learning prediction methods, the method has higher prediction accuracy and stability, and the generated prediction intervals provide an effective reference for power scheduling.

Drawings

FIG. 1 is a flowchart of SHEGPR wind power prediction;

FIG. 2 is a three-dimensional plot of the mapping between wind farm power and wind speed and wind direction;

FIG. 3 is a comparison of GPR and EGPR on the subspaces for 4 h wind power prediction;

FIG. 4 shows wind power prediction trend curves for prediction horizons of 15 min, 1 h, 2 h and 4 h.

Detailed Description

The technical solution of the present invention is further described in detail below with reference to the accompanying drawings and specific embodiments.

Example 1

As shown in FIG. 1, this embodiment takes as an example the wind power data of a wind farm from the National Renewable Energy Laboratory (NREL) in the United States; historical wind speed, historical power and historical wind direction data are taken as inputs, the number of delay variables is set to 8, and power is taken as the output of the SHEGPR model.

Step 1: selecting historical wind power, wind speed and wind direction data with a time resolution of 15 minutes (96 data points per day) from January to March for a wind farm of the National Renewable Energy Laboratory (NREL) in the United States, and dividing the data in chronological order into a training set D_train (3000 samples), a validation set D_val (1000 samples) and a test set D_test (4000 samples); the mapping between the power of the wind farm and the wind speed and wind direction is shown in FIG. 2.

Step 2: resampling D_train multiple times with the Bootstrapping method to obtain L sub-sample sets {(X_1, y_1), ..., (X_L, y_L)}, ranking the importance of the sample features with partial least squares (PLS), repeating R times and deleting duplicate subspaces to obtain N subspaces {S_1, ..., S_N} of D_train, and saving the feature indexes of the training samples corresponding to the N subspaces.
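The resampling in Step 2 can be sketched as drawing sample indexes with replacement. The sketch assumes each sub-sample set has the same size as D_train, which is the usual bootstrap convention but is not stated in the source; `bootstrap_indices` and its `seed` argument are hypothetical names.

```python
import random


def bootstrap_indices(n_samples, n_sets, seed=0):
    """Draw n_sets index lists, each of size n_samples, with replacement."""
    rng = random.Random(seed)   # seeded so the resampling is reproducible
    return [
        [rng.randrange(n_samples) for _ in range(n_samples)]
        for _ in range(n_sets)
    ]


# e.g. L = 5 sub-sample sets drawn from a training set of 3000 samples
index_sets = bootstrap_indices(3000, 5, seed=1)
```

Each index list selects the rows of D_train that form one sub-sample set (X_r, y_r); because sampling is with replacement, some rows repeat and others are left out, which is the source of the sample-level diversity.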

The process of feature selection on the sub-sample sets by partial least squares (PLS) is as follows:

① applying Z-score normalization to the training set D_train, training PLS on each of the L sub-sample sets, and determining the number of PLS principal components by cross-validation, to obtain the regression coefficients β_r of each sub-sample set, where r ∈ {1, ..., L}; β_r represents the importance of each feature in the input X to y on that sub-sample set.

② sorting the entries of β_r from large to small to obtain {b_1, b_2, b_3, ..., b_q}, and judging according to formula (1),

wherein b_i is the i-th entry of the sorted β_r and th is set to 0.85. If the formula holds, the indexes corresponding to the first i features are saved.

Step 3:

① mapping the subspace indexes to D_train to obtain N subspace training data sets {D_tra,1, ..., D_tra,N}, setting the maximum number of GMM clusters to v, building a Gaussian mixture model (GMM) on each subspace training data set, and storing the GMM of each subspace to obtain N GMM models. Suppose that clustering the i-th subspace training data set D_tra,i yields z local regions {LD_1, LD_2, ..., LD_z}. The GMM algorithm is as follows:

for any sample x,

$$p(x \mid \theta) = \sum_{k=1}^{c} \lambda_k\,\mathcal{N}(x \mid \mu_k, \Sigma_k)$$

wherein θ = {λ_k, μ_k, Σ_k}, k = 1, ..., c, are the model parameters of the GMM, c is the number of Gaussian components, λ_k is the weight of the k-th Gaussian component, and μ_k and Σ_k are respectively the mean and covariance matrix of the k-th Gaussian component; the parameters of the GMM model are found by the expectation-maximization algorithm.

② for the z local regions {LD_1, LD_2, ..., LD_z} of D_tra,i, modeling each LD with Gaussian process regression (GPR) to obtain a GPR model set, denoted {GPR_1, GPR_2, ..., GPR_z}. In detail, for a new sample x*, the GPR model of the i-th local region can be described as:

$$\hat{y}_i = k_{i,*}^{T} C^{-1} y_i, \qquad \sigma_i^2 = C(x^*, x^*) - k_{i,*}^{T} C^{-1} k_{i,*}$$

wherein k_i,* = [C(x*, x_i,1), ..., C(x*, x_i,p)]^T, C is the p × p positive definite covariance matrix of the training inputs of the i-th local region, y_i is the vector of the corresponding training outputs, and ŷ_i and σ_i² are respectively the predicted mean and variance of the sub-model GPR_i.

③ repeating step ② N times to build a GPR model set for each of the N subspace training data sets.

Step 4: mapping the subspace indexes to D_val to obtain N subspace validation data sets {D_val,1, ..., D_val,N}, and applying Z-score normalization to the data on each subspace. Then, from the N GMM models and GPR model sets built in Step 3, obtaining by the Bayesian inference strategy and the finite mixture mechanism the prediction outputs and variances of the N EGPR models on {D_val,1, ..., D_val,N}.

Specifically, an EGPR model is built as follows:

for a new sample x*, assuming that the number of local regions after GMM clustering on the n-th subspace is z, the posterior probability that x* belongs to each local region is

$$P(LD_i \mid x^*) = \frac{P(x^* \mid LD_i)\,P(LD_i)}{\sum_{j=1}^{z} P(x^* \mid LD_j)\,P(LD_j)}$$

wherein i ∈ {1, 2, 3, ..., z}, LD_i represents the i-th local region, P(x*|LD_i) is the conditional probability, and P(LD_i) is the prior probability.

The predicted output on the n-th subspace is obtained by the finite mixture mechanism as:

$$\hat{y}_n = \sum_{i=1}^{z} P(LD_i \mid x^*)\,\hat{y}_i$$

wherein ŷ_i is the predicted value of the GPR model of the i-th local region and P(LD_i|x*) is the posterior probability.

Similarly, the mixed variance can be calculated as:

$$\sigma_n^2 = \sum_{i=1}^{z} P(LD_i \mid x^*)\left[\sigma_i^2 + \left(\hat{y}_i - \hat{y}_n\right)^2\right]$$

wherein σ_i² is the predicted variance of the sub-model GPR_i.
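The fusion of local GPR outputs in Step 4 can be sketched as follows. The posterior is Bayes' rule over the local regions; the mixed variance uses the standard finite-mixture form (within-model variance plus spread of the local means), which is an assumption here. The function names are hypothetical, and in practice the likelihoods P(x*|LD_i) and priors P(LD_i) would come from the fitted GMM.

```python
def region_posterior(likelihoods, priors):
    """P(LD_i | x*) from P(x* | LD_i) and P(LD_i) by Bayes' rule."""
    joint = [l * p for l, p in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]


def mix_predictions(posterior, means, variances):
    """Finite-mixture mean and variance of the local GPR predictions."""
    y_mix = sum(w * m for w, m in zip(posterior, means))
    # assumed form: weighted local variance plus spread of the local means
    var_mix = sum(
        w * (v + (m - y_mix) ** 2)
        for w, m, v in zip(posterior, means, variances)
    )
    return y_mix, var_mix


post = region_posterior(likelihoods=[1.0, 3.0], priors=[0.5, 0.5])
y_hat, var_hat = mix_predictions(post, means=[0.0, 4.0], variances=[1.0, 1.0])
```

In this toy case the second region is three times as likely, so the mixed prediction (3.0) sits closer to its local mean, and the mixed variance (4.0) exceeds either local variance because the two local models disagree.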

Step 5: this step constructs an optimization problem to select the EGPR models used for the second-layer integration. First, the first-layer integration yields N EGPR models {EGPR_1, EGPR_2, ..., EGPR_N}; the indexes of all EGPR models are binary coded, where 1 indicates that the model is selected and 0 that it is not. Then, the weighted sum of the prediction accuracy and the mixed standard deviation on the validation set of the EGPR models obtained in Step 4 is taken as the objective function, a genetic algorithm (GA) is used as the optimization algorithm, min{f_obj} is sought through multiple iterations, and the models that perform well and are diverse are selected and their indexes stored; it is assumed that M EGPR models are finally selected for the second-layer integration.

The optimization objective is constructed as follows:

f_obj＝λRMSE+(1-λ)σ (8)

wherein λ is a parameter between 0 and 1, σ is the predicted mixed standard deviation, and RMSE is the root mean square error during the optimization process, as detailed below:

suppose that, at some point of the optimization, the prediction outputs of the m currently selected EGPR models are integrated by simple averaging to obtain the integrated prediction result, calculated as follows:

$$\hat{y}_{ens} = \frac{1}{m}\sum_{i=1}^{m} \hat{y}_{EGPR,i} \tag{9}$$

wherein m is the number of EGPR models currently selected; the RMSE with respect to the true values is:

$$RMSE = \sqrt{\frac{1}{N_{val}}\sum_{j=1}^{N_{val}}\left(y_j - \hat{y}_{ens,j}\right)^2} \tag{10}$$

wherein N_val is the number of samples in the validation set D_val.
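The objective evaluated for one binary-coded selection can be sketched as below. Two assumptions are made for illustration: σ is taken as the average, over validation samples, of the mean predicted standard deviation of the selected models, and the genetic algorithm is replaced by an exhaustive search over all masks, which is only feasible for a small number of models. All names are hypothetical.

```python
import itertools
import math


def objective(mask, preds, stds, y_true, lam=0.5):
    """f_obj = lam * RMSE + (1 - lam) * sigma for a 0/1 selection mask.

    preds[i][j], stds[i][j]: prediction and predicted standard deviation
    of EGPR model i on validation sample j.
    """
    chosen = [i for i, bit in enumerate(mask) if bit]
    if not chosen:
        return math.inf                     # an empty ensemble is invalid
    m, n = len(chosen), len(y_true)
    # simple average of the selected models
    ens = [sum(preds[i][j] for i in chosen) / m for j in range(n)]
    rmse = math.sqrt(sum((y_true[j] - ens[j]) ** 2 for j in range(n)) / n)
    # assumed aggregation of the mixed standard deviation
    sigma = sum(sum(stds[i][j] for i in chosen) / m for j in range(n)) / n
    return lam * rmse + (1.0 - lam) * sigma


def best_mask_exhaustive(preds, stds, y_true, lam=0.5):
    """Stand-in for the GA: try every mask and keep the minimiser."""
    masks = itertools.product([0, 1], repeat=len(preds))
    return min(masks, key=lambda mk: objective(mk, preds, stds, y_true, lam))


preds = [[1.0, 2.0], [3.0, 4.0]]
stds = [[1.0, 1.0], [1.0, 1.0]]
y_val = [1.0, 2.0]
best = best_mask_exhaustive(preds, stds, y_val)
```

A genetic algorithm would evaluate the same `objective` as its fitness function while exploring only part of the 2^N masks.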

Step 6: in the online prediction phase, a sample x* of the test set D_test is predicted by the following steps:

① mapping the subspace indexes to x* to obtain the N subspace test samples, and applying Z-score normalization to the data on each subspace; as in Step 4, obtaining the prediction outputs and variances of the N EGPR models from the N GMM models and GPR model sets built in Step 3 by the Bayesian inference strategy and the finite mixture mechanism.

② performing the second-layer integration of the M EGPR models selected in Step 5 in a variance-based integration manner; the second-layer integrated prediction output is the weighted combination

$$\hat{y}^{*} = \sum_{i=1}^{M} w_i\,\hat{y}_{EGPR,i}$$

and the prediction variance is computed with the same weights w_i, wherein the weight w_i is the posterior probability of the i-th model:

$$w_i = P(EGPR_i \mid x^*) = \frac{P(x^* \mid EGPR_i)\,P(EGPR_i)}{\sum_{j=1}^{M} P(x^* \mid EGPR_j)\,P(EGPR_j)}$$

wherein P(x*|EGPR_i) is the conditional probability and P(EGPR_i) is the prior probability. Without prior knowledge, the prior probabilities of all models are assumed to be equal, with value 1/M; the conditional probability P(x*|EGPR_i) is expressed in terms of γ, a parameter for controlling the weights.

③ finally, for a test sample, the prediction interval at the 95% confidence level is $[\hat{y}^{*} - 1.96\,\sigma^{*},\ \hat{y}^{*} + 1.96\,\sigma^{*}]$, where σ* is the second-layer predicted standard deviation.

Step 7: as the prediction time grows, the performance of the model inevitably degrades, so adaptive updating of the model becomes necessary. When a new sample (x_t+1, y_t+1) arrives, the posterior probabilities that the new sample belongs to the different local regions are first estimated, and the EGPR model is then updated in the local region with the maximum posterior probability; assuming that the new sample point x_t+1 has the maximum posterior probability P(LD_k|x_t+1) in LD_k, the update operation comprises two steps:

① updating the covariance matrix Σ_GPR of the GPR model of the k-th local region by a moving window.

② updating incrementally the mean vector μ_k, the covariance matrix Σ_k and the mixing coefficient π_k of the GMM in the k-th local region, the mixing coefficient being updated as:

π_k ^(t+1)＝π_k ^(t)+α(P(k|x_t+1)-π_k ^(t)) (17)

wherein α is a step size determined by T, and T is the number of samples used in the mixing-coefficient update.

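This embodiment evaluates the prediction performance with the root mean square error RMSE and the coefficient of determination R², defined as:

$$RMSE = \sqrt{\frac{1}{N_{test}}\sum_{i=1}^{N_{test}}\left(y_i - \hat{y}_i\right)^2}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{N_{test}}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N_{test}}\left(y_i - \bar{y}\right)^2}$$

in the formula, N_test is the number of test samples, y_i and ŷ_i are respectively the actual value and the predicted value of the i-th sample, and ȳ is the average of the actual values.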
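The two evaluation metrics can be computed with a short sketch that uses their standard definitions; the function names `rmse` and `r2_score` are chosen here for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error over the test samples."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


actual = [1.0, 2.0, 3.0]
constant_guess = [2.0, 2.0, 2.0]   # always predicts the mean
```

Predicting the mean of the actual values for every sample gives R² = 0, so a useful model should score clearly above zero, while a perfect prediction gives RMSE = 0 and R² = 1.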

The following methods are compared: (1) the global GPR model; (2) the persistence method; (3) the Gaussian process regression model based on selective hierarchical integration (SHEGPR); (4) the Gaussian process regression model based on selective hierarchical integration with adaptive updating, SHEGPR (with update) (Example 1). The comparison results are shown in Table 1 and Table 2.

TABLE 1 Comparison of prediction performance of different prediction methods at 2 hours ahead

TABLE 2 Comparison of prediction performance of different prediction methods at 4 hours ahead

As can be seen from Tables 1 and 2, the method proposed in this example improves substantially on the global GPR and persistence methods in both RMSE and R², which demonstrates the effectiveness and generality of the invention. The global GPR model, however, performs only comparably to the persistence method: GPR is built from historical samples, so its performance on the test set degrades because of concept drift, whereas the persistence method uses the latest sample information, outputting the most recent available sample as the next prediction. Therefore, to predict the power more accurately, adaptive updating of the model is a key part of wind power prediction.
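The persistence baseline mentioned above can be sketched in a few lines: the forecast for time t is the value observed a fixed number of steps earlier. At the 15-minute resolution of this example, a 2-hour horizon corresponds to 8 steps and a 4-hour horizon to 16; the function name `persistence_forecast` is hypothetical.

```python
def persistence_forecast(power, horizon_steps):
    """Forecast power[t] by power[t - horizon_steps] for every valid t.

    The returned list is aligned with power[horizon_steps:].
    """
    return [power[t - horizon_steps] for t in range(horizon_steps, len(power))]


history = [1.0, 2.0, 3.0, 4.0, 5.0]
forecast = persistence_forecast(history, horizon_steps=2)
targets = history[2:]              # the values the forecasts are compared with
```

Because this baseline always uses the latest available measurement, it adapts to drift automatically, which is why a static model trained once on historical data can struggle to beat it.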

As can be seen from FIG. 3, the performance of the EGPR model, obtained by clustering the subspace with GMM and then modeling with GPR, differs significantly from that of the GPR model built directly on the subspace. The proposed approach of building sub-models per cluster after GMM clustering is therefore both faster and better performing. FIG. 4 shows, from top to bottom, the wind power prediction trend curves of the SHEGPR (with update) method for horizons of 15 min, 1 h, 2 h and 4 h; the predicted values fit the actual values well, and the shorter the prediction horizon, the better the fit. The method not only predicts the trend of the wind power but also provides a prediction interval that quantifies the uncertainty of the wind power, which supports stable scheduling of the power system. As can also be seen from FIG. 4, the shorter the prediction horizon, the narrower the 95% confidence interval, which means the interval prediction is more informative and more useful for stable scheduling of the power system.

It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims

1. A wind power probability prediction method based on hierarchical integration is characterized by comprising the following steps:

step (1): selecting historical meteorological data D of a wind farm over a period of time as the modeling sample set, and dividing the sample set into a training set D_train, a validation set D_val and a test set D_test; resampling D_train multiple times with the Bootstrapping method to obtain L sub-sample sets {(X_1, y_1), ..., (X_L, y_L)}; selecting the input feature variables of each sub-sample set with the partial least squares (PLS) method by ranking their importance, deleting identical feature subsets, and constructing N subspaces {S_1, ..., S_N}; and saving the input feature variable indexes of the training set samples corresponding to the N subspaces;

step (2): mapping the subspace indexes to the training set D_train to obtain N subspace training data sets {D_tra,1, ..., D_tra,N}; clustering each subspace with a Gaussian mixture model GMM, assuming that z local regions {LD_1, LD_2, ..., LD_z} are obtained on the i-th subspace training data set D_tra,i; modeling each local region with Gaussian process regression to obtain a GPR model set {GPR_1, GPR_2, ..., GPR_z}; for a new sample x*, obtaining the prediction output of the first-layer integrated EGPR model on the i-th subspace by a Bayesian inference strategy and a finite mixture mechanism; similarly, obtaining on the N subspaces the prediction outputs of N first-layer integrated EGPR models {EGPR_1, EGPR_2, ..., EGPR_N};

step (3): according to step (2), calculating on the validation set D_val the prediction accuracy RMSE and the standard deviation STD of the N first-layer integrated EGPR models, taking the weighted combination of RMSE and STD as the optimization objective of model selection, and selecting with a genetic algorithm the first-layer integrated EGPR models that perform well and stably, assuming that M first-layer integrated EGPR models are selected as the sub-models of the second-layer integration;

2. The wind power probability prediction method based on hierarchical integration according to claim 1, wherein in step (1) the historical meteorological data D comprises the meteorological data and operating data of the wind farm over the past 2 to 4 months, D = {x, y},

wherein p is the number of samples, q = f × l, f is the number of input features, and l is the number of delay variables; y is the predicted power; the input features comprise historical wind speed W_S, historical power P and historical wind direction W_D.

3. The wind power probability prediction method based on hierarchical integration according to claim 2, wherein the specific process of performing feature selection on the sub-sample set by Partial Least Squares (PLS) in the step (1) is as follows:

r ∈ { 1.., L }, which represents the importance of the variable in input X to y on this subsample set;

in the formula (1), b_iIs β_rTh of the ith data is set to be 0.8-0.9; if the formula (1) is established, storing indexes corresponding to the first i characteristics;

4. The wind power probability prediction method based on hierarchical integration according to claim 1, wherein the process of clustering the subspace by using the Gaussian mixture model GMM and establishing the first-layer integrated EGPR model in the step (2) is as follows:

in training set D_trainAnd on, setting the nth subspace,

wherein p is the number of samples, and c is the number of features in the subspace; setting the maximum clustering number v, establishing a GMM model by the nth subspace, and setting the nth subspace data to be gathered into z types, wherein z is less than or equal to v, namely z local regions { LD₁，LD₂，...，LD_z}; then, local models are built for the z local areas by Gaussian process regression to obtain z GPR models which are marked as { GPR₁，GPR₂，...，GPR_z}；

For a new sample x^*The ith local region GPR model can be described as

and

GPR as submodel respectively_iThe predicted mean and variance of;

in formula (4), i ∈ {1, 2, 3, ..., z}, LD_i represents the i-th local region; P(x*|LD_i) is the conditional probability, and P(LD_i) is the prior probability; the predicted output on the n-th subspace is then obtained by the finite mixture mechanism as:

in the formula (5), the reaction mixture is,

similarly, the mixed variance can be calculated as:

in the formula (6), the reaction mixture is,

GPR as a submodel_iThe predicted variance of (c);

5. the wind power probability prediction method based on hierarchical integration according to claim 1, wherein the detailed process of the step (3) is as follows:

① mapping the subspace indexes to the validation set D_val to obtain the validation data sets {D_val,1, ..., D_val,N} on the N subspaces, and obtaining, according to step (2), the prediction outputs of the N EGPR models on the validation set D_val;

f_obj＝λRMSE+(1-λ)σ (8)

It is calculated as follows:

6. The wind power probability prediction method based on hierarchical integration according to claim 1, wherein the detailed process of the step (4) is as follows:

And the predicted variance

Comprises the following steps:

wherein the content of the first and second substances,

wherein the content of the first and second substances,

in the case of a conditional probability,

for a priori probability, each model is assumed without some a priori knowledge

Are equal and have a value of

Can be expressed as:

wherein, γ is a parameter for controlling the weight.

7. The wind power probability prediction method based on hierarchical integration according to claim 1, wherein the detailed process of the step (5) is as follows:

when a new sample (x_t+1, y_t+1) arrives, the posterior probabilities that the new sample belongs to the different local regions are first estimated, and the EGPR model is then updated in the local region with the maximum posterior probability; assuming that the new sample point x_t+1 has the maximum posterior probability P(LD_k|x_t+1) in LD_k, the update operation comprises two steps:

π_k ^(t+1)＝π_k ^(t)+α(P(k|x_t+1)-π_k ^(t)) (17)

wherein α is

T is the number of samples taken used in the mix update.