CN110674965A - Multi-time step wind power prediction method based on dynamic feature selection - Google Patents
Multi-time step wind power prediction method based on dynamic feature selection
- Publication number
- CN110674965A (application CN201910401140.2A)
- Authority
- CN
- China
- Prior art keywords
- variables
- anfis
- input
- variable
- information
- Prior art date
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06Q10/04 — Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
- G06N3/006 — Artificial life, i.e. computing arrangements simulating life, based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
- G06N3/043 — Neural-network architecture based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
- G06N3/08 — Neural networks; learning methods
- G06N5/048 — Fuzzy inferencing
- G06Q50/06 — Energy or water supply
Abstract
The invention discloses a multi-time-step wind power prediction method based on dynamic feature selection. The method provides a hybrid intelligent model that mines historical power time series and public numerical weather prediction (NWP) data with a dynamic feature-selection algorithm, so as to address the difficulty of predicting wind power generation over different time steps. On the basis of the available raw data, a minimum-redundancy maximum-relevance (mRMR) dynamic filtering method automatically selects the input variables for each prediction horizon; supervised learning is then performed on the optimally featured input data by an adaptive neuro-fuzzy inference system (ANFIS); the model trains the ANFIS parameters with a particle swarm optimization (PSO) algorithm to achieve the best prediction performance. Finally, the proposed hybrid intelligent model is evaluated on the operating data of an actual distributed wind turbine installation, and its effectiveness is verified by the experimental results.
Description
Technical Field
The invention relates to the field of multi-time-step wind power prediction for new-energy power generation, and in particular to a multi-time-step wind power prediction method based on dynamic feature selection.
Background
In recent years, the urgent demand for a low-carbon economy and the progress of wind power technology have been driving the rapid, sustainable transformation of the energy industry and the development of wind power worldwide. Because intermittently and randomly generated wind power is difficult to integrate into the grid, active distribution network technology is considered an effective way to utilize renewable wind power on a large scale, and one of its key technologies is the ability to analyze and predict wind power performance. Accurate multi-time-step wind power prediction can improve wind power utilization and system reliability, reduce operating costs, and allow flexible scheduling strategies. However, such multi-time-step prediction is an important and difficult task, requiring advanced algorithmic solutions and tools with sufficient accuracy and acceptable computational complexity.
The prediction method is a key factor affecting wind power forecast accuracy. Artificial-intelligence models mine the patterns among the available information variables in a data-driven manner and can greatly improve prediction accuracy. However, if the available information is analyzed and predicted directly, the relationship between future wind power output and each factor cannot be fully exploited; information redundancy arises and important patterns are drowned in noise, turning multi-time-step wind power prediction into a complex, multivariate, highly nonlinear problem. A hybrid intelligent method that selects input variables through targeted feature selection is an effective solution to this problem.
Disclosure of Invention
The invention aims to provide, against the deficiencies of the prior art, a multi-time-step wind power prediction method based on dynamic feature selection, so as to solve the difficulty of predicting wind power generation over different time steps. The technical scheme adopted by the invention is as follows:
Step (1): For each prediction horizon, construct the raw input variable set from historical power records and public numerical weather prediction (NWP) information; it comprises features of the power time series and weather elements such as air temperature, air pressure, humidity, wind speed, and wind direction. The set of candidate input features for predicting time step τ at the current time t is as follows:

F(t, τ) = { p(t−T), p(t−T+1), …, p(t), wdv(t+τ), trd(t+τ), temp(t+τ), hum(t+τ), aip(t+τ) }  (1)

where p(t−T), p(t−T+1), …, p(t) are the interval-average power values of the historical time series and T is the length of the series; wdv(t+τ), trd(t+τ), temp(t+τ), hum(t+τ), aip(t+τ) are the NWP values of the forecast interval, in order wind speed, wind direction, air temperature, humidity, and air pressure. For ultra-short-term prediction, a 15-minute time resolution is typically required to predict the 4-hour future wind power sequence, so the prediction step may take τ = 1, 2, …, 16.
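The candidate set of step (1) can be sketched in Python as follows. This is an illustrative sketch under assumed data layouts; the function name, the dict keys, and the indexing convention are assumptions, not from the patent:

```python
import numpy as np

def build_candidate_features(power_hist, nwp, t, tau, T=15):
    """Build the raw candidate input set of step (1) for predicting
    tau steps ahead of the current time index t.

    power_hist : 1-D array of 15-min interval-average power values
    nwp        : dict of NWP series keyed 'wdv','trd','temp','hum','aip'
    Returns the vector [p(t-T), ..., p(t), wdv(t+tau), ..., aip(t+tau)].
    """
    # p(t-T) .. p(t): T+1 historical power values (16 when T = 15)
    hist = [power_hist[t - k] for k in range(T, -1, -1)]
    # NWP values at the forecast interval t + tau
    weather = [nwp[key][t + tau] for key in ('wdv', 'trd', 'temp', 'hum', 'aip')]
    return np.array(hist + weather)
```

With T = 15 this yields the 16 + 5 = 21 candidate attribute variables used in the embodiment below.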
Step (2): To reduce the number of ANFIS input variables and the computational complexity without degrading prediction accuracy, features must be extracted or selected from the data so that enough relevant features are retained while irrelevant ones are removed as far as possible. Because the effective information that each feature variable contributes to the prediction differs across time steps, the invention provides a dynamic minimum-redundancy maximum-relevance (mRMR) feature-selection algorithm that ranks all attribute variables and computes their cumulative information contribution rate, thereby determining the attribute variables used as model inputs. The method comprises the following substeps:
(2.1) Computing mutual information between feature variables: mutual information measures both linear and nonlinear correlations between variables, and the concept is based on information entropy. The mutual information I(X; Y) between any two discrete random variables X and Y is defined by equation (2) and represents the amount of information that variable X provides in reducing the uncertainty of variable Y:

I(X; Y) = Σ_{x∈X} Σ_{y∈Y} p(x, y) · log[ p(x, y) / (p(x)p(y)) ]  (2)

where p(x, y) is the joint probability distribution function of the random variables X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively.
(2.2) Computing the feature information score: the minimum-redundancy maximum-relevance (mRMR) criterion considers maximum relevance and minimum redundancy simultaneously when identifying a subset of input variables. Simply combining individually informative input variables may not yield good prediction performance, so both the amount of information provided by a single input variable and the redundancy among variables should be taken into account. Under the mRMR criterion, the information score of a single input variable v_i is defined by equation (3):

J(v_i, S) = I(v_i; t) − (1/|S|) · Σ_{s∈S} I(v_i; s)  (3)

where V is the set of all candidate attribute variables, S is the set of already selected input variables, and |·| denotes the number of variables contained in a set. The first term on the right uses the mutual information I(v_i; t) between the prediction target t and the candidate input variable v_i to measure the strength of the candidate's relevance to the prediction task. The second term penalizes similarity, preferring variables with the least redundancy with respect to those already selected, so that the chosen set is more representative of, and informative about, the whole.
(2.3) Dynamically selecting and ranking feature variables: input variables are selected by an incremental forward search according to equation (3) of step (2.2), and the candidate feature variables are selected and ranked by their information scores J(v_i, S). In the first step, the maximum-relevance component of every candidate attribute variable, i.e. its mutual information with the prediction target, is computed, and the variable with the largest mutual information becomes the first input variable:

s_1 = argmax_{v∈V} I(v; t)  (4)
the sorting selection of the remaining variables is obtained according to the size comparison of the information content scores contained in the formula (3): in step m, the remaining variable is V-Sm-1Wherein m is more than or equal to 2 and less than or equal to V, Sm-1Is an ordered set s of ordered variables in m-11,s2,...,sm-1Thus, the variables selected in step m are:
the algorithm dynamically searches forward one step each time an input variable is selected, so the process incrementally evaluates all candidate variables until step | V |, all attribute variables V are evaluated and ranked according to mRMR criteria.
(2.4) Determining the final filtered feature variables: from the ranked features obtained in step (2.3), compute each variable's information score InSc_m and cumulative information contribution CumInSc_m, as in equations (6)-(7):

InSc_1 = I(s_1; t), InSc_m = J(s_m, S_{m−1}), m = 2, 3, …, |V|  (6)

CumInSc_m = Σ_{k=1}^{m} InSc_k  (7)

When CumInSc_m reaches its maximum, the corresponding ordered variable set S_max gives the final input variables; it represents the least redundant and most relevant feature attributes and is used for regression pattern mining by the subsequent machine-learning algorithm.
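Steps (2.1)-(2.4) can be sketched as a compact incremental forward mRMR search. This is an illustrative Python sketch, not the patent's implementation: the histogram-based mutual-information estimator, the bin count, and all names are assumptions:

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram estimate of I(X;Y) for two 1-D samples (equation (2))."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)        # marginal of X
    py = pxy.sum(axis=0, keepdims=True)        # marginal of Y
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def mrmr_rank(X, t, bins=8):
    """Rank all columns of X by the incremental forward mRMR search of
    equations (3)-(5); returns (order, per-step scores InSc_m)."""
    n_var = X.shape[1]
    relevance = [mutual_info(X[:, j], t, bins) for j in range(n_var)]
    order = [int(np.argmax(relevance))]        # s1: max relevance, eq. (4)
    scores = [relevance[order[0]]]
    while len(order) < n_var:
        best_j, best_s = None, -np.inf
        for j in range(n_var):
            if j in order:
                continue
            # mean redundancy with the already selected set S
            red = np.mean([mutual_info(X[:, j], X[:, k], bins) for k in order])
            s = relevance[j] - red             # J(v_j, S), eq. (3)
            if s > best_s:
                best_j, best_s = j, s
        order.append(best_j)                   # eq. (5)
        scores.append(best_s)
    return order, scores
```

The final subset of step (2.4) is then the prefix of `order` that maximizes the cumulative score, e.g. `order[:np.argmax(np.cumsum(scores)) + 1]`.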
Step (3): Using the sample data containing the optimal feature variables obtained in step (2) as the training set, optimize and tune the network parameters of the adaptive neuro-fuzzy inference system (ANFIS), through the following substeps:
(3.1) Determining the ANFIS network structure: set the membership-function shape type and the number of fuzzy sets of the ANFIS, which together determine the network structure. Once the structure is fixed, the training parameters are also determined, comprising the premise parameters and the consequent parameters;
For a classical two-input single-output network, the ANFIS is constructed as follows. Let the inputs be x1, x2 and the output be y; taking two fuzzy sets per input as an example, the inference rules are expressed as:
Rule 1: if x1 is A1 and x2 is B1, then y = p1·x1 + q1·x2 + r1;
Rule 2: if x1 is A1 and x2 is B2, then y = p2·x1 + q2·x2 + r2;
Rule 3: if x1 is A2 and x2 is B1, then y = p3·x1 + q3·x2 + r3;
Rule 4: if x1 is A2 and x2 is B2, then y = p4·x1 + q4·x2 + r4;
The structure of each layer of the ANFIS is described as follows:
Layer 1 (fuzzification): the input variables are fuzzified, and a membership function determines the degree of membership of each input in each fuzzy set:

μ_{A_i}(x1) = MF(x1), μ_{B_j}(x2) = MF(x2), i, j = 1, 2  (8)

where A_i, B_j are the partitioned fuzzy sets, μ is the membership degree in the corresponding case, and MF(·) is the membership function; for a Gaussian curve, MF(x) = exp(−(x − c)²/(2σ²)) as in equation (12), and the parameter set {c, σ} constitutes the premise parameters of the ANFIS.
Layer 2 (rule inference): the rule neurons receive inputs from their respective fuzzification neurons and compute the activation weight under each inference rule:

ω_n = μ_{A_i}(x1) · μ_{B_j}(x2), n = 1, 2, 3, 4; i, j = 1, 2  (10)
Layer 3 (normalization): each neuron of this layer receives all outputs of the previous layer and normalizes the activation weight of its rule: ω̄_n = ω_n / (ω_1 + ω_2 + ω_3 + ω_4);
Layer 4 (defuzzification): this layer defuzzifies the activation weight under each rule with a rule consequent function and computes the weighted consequent value ω̄_n·f_n of each rule, where f_n = p_n·x1 + q_n·x2 + r_n. The parameter sets {p_n, q_n, r_n} of the linear consequent functions constitute the consequent parameters of the ANFIS;
fifth layer (output): this layer sums all the defuzzified neuronal outputs to arrive at the final output y of the ANFIS.
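The five-layer forward pass described above, for the classical two-input, two-fuzzy-set network, can be sketched as follows. This is an illustrative sketch; the data structures holding the premise and consequent parameters are assumptions, not from the patent:

```python
import numpy as np

def gauss_mf(x, c, sigma):
    """Gaussian membership function; {c, sigma} are premise parameters."""
    return np.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

def anfis_forward(x1, x2, premise, conseq):
    """Forward pass of the 2-input/1-output ANFIS with 2 fuzzy sets per input.

    premise: dict with keys 'A' and 'B', each a list of two (c, sigma) pairs
    conseq : array of shape (4, 3), row n holding (p_n, q_n, r_n)
    """
    # Layer 1: fuzzification, eq. (8)
    muA = [gauss_mf(x1, c, s) for c, s in premise['A']]
    muB = [gauss_mf(x2, c, s) for c, s in premise['B']]
    # Layer 2: rule activation weights, eq. (10)
    w = np.array([muA[i] * muB[j] for i in range(2) for j in range(2)])
    # Layer 3: normalization
    w_bar = w / w.sum()
    # Layer 4: linear rule consequents f_n = p_n*x1 + q_n*x2 + r_n
    f = conseq @ np.array([x1, x2, 1.0])
    # Layer 5: weighted sum gives the final output y
    return float(np.dot(w_bar, f))
```

Because the normalized weights sum to one, identical consequent rows reduce the network to that single linear function, which makes the pass easy to sanity-check.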
(3.2) Optimizing the ANFIS training parameters: after the initial ANFIS structure and parameters are determined, the premise parameters that shape the membership functions and the consequent parameters of the rule consequent functions are optimized. A particle swarm optimization (PSO) algorithm performs a heuristic search over the network training parameters so that the root-mean-square error over the sample data reaches its minimum; the objective function is:

f = sqrt( (1/N) · Σ_{i=1}^{N} ((P_oi − P_ri)/G_i)² )  (14)

where P_oi is the output of the ANFIS network for the i-th input sample, P_ri is the actual measurement of the corresponding sample, G_i is the online operating capacity, and N is the number of samples.
Optimizing the ANFIS solution with the particle swarm optimization (PSO) algorithm comprises the following substeps:
Step 1: PSO initialization: set the algorithm parameters, including the population size N_s, the total iteration count T_iter, the maximum velocity v_max, and the learning factors c1, c2; initialize the particle positions x_i = (x_i1, x_i2, …, x_i,dim), i = 1, 2, …, N_s, where dim is the number of ANFIS parameters and each position vector x_i represents one set of ANFIS parameters; initialize the velocity vectors v_i = (v_i1, v_i2, …, v_i,dim) with v_id ∈ [−v_max, v_max]; set the initial iteration index t = 0 and the initial inertia factor w_0 = 1;
Step 2: compute each particle's fitness value Fit[i] = f(x_i), i = 1, 2, …, N_s;
Step 3: select the globally best solution from the population according to the fitness values and record it as Gbest = (P_g1, P_g2, …, P_g,dim);
Step 4: retain the best position each particle i has visited, recorded as Pbest_i = (P_i1, P_i2, …, P_i,dim);
Step 5: update the next-generation particle velocities and positions as follows:

v_id(t+1) = v_id(t) + c1·r1·(P_id − x_id) + c2·r2·(P_gd − x_id)  (15)

x_id(t+1) = x_id(t) + w·v_id(t+1)  (16)

i = 1, 2, …, N_s; d = 1, 2, …, dim

where r1 and r2 are random numbers in (0, 1), and w is an inertia factor linearly decreased from 1 to 0 with the iteration count: w = 1 − t/T_iter;
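Steps 1-5 can be sketched as below, following the document's update rules (15)-(16), in which the linearly decreasing inertia factor w scales the velocity inside the position update. The sphere function in the usage stands in for the RMSE objective (14), and all parameter values and names are illustrative assumptions:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=24, n_iter=80, v_max=0.5,
                 c1=2.0, c2=2.0, seed=0):
    """Minimize f over R^dim with the PSO variant of eqs. (15)-(16).

    In training, f would evaluate the RMSE objective (14) for one
    candidate vector of ANFIS premise/consequent parameters."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (n_particles, dim))        # Step 1: positions
    v = rng.uniform(-v_max, v_max, (n_particles, dim))    # Step 1: velocities
    pbest = x.copy()                                      # Step 4 memory
    pbest_val = np.array([f(p) for p in x])               # Step 2
    g = pbest[np.argmin(pbest_val)].copy()                # Step 3: Gbest
    for t in range(n_iter):
        w = 1.0 - t / n_iter                              # w = 1 - t/T_iter
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)  # eq. (15)
        v = np.clip(v, -v_max, v_max)                      # velocity limit
        x = x + w * v                                      # eq. (16)
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(pbest_val.min())
```

Usage on a stand-in objective: `pso_minimize(lambda p: float(np.sum(p * p)), dim=3)` drives the swarm toward the origin as the step size shrinks with w.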
Step (4): Perform prediction: at the current time t, for each prediction task with time step τ, prepare the candidate-feature input vector according to step (1), and construct the input vector from the optimal input features determined in step (2). Once the input vector is determined, feed it to the ANFIS trained in step (3); the resulting output is the predicted value under the current conditions.
The beneficial effects of the invention, addressing the difficulty of multi-time-step wind power prediction, are as follows: (1) a hybrid prediction model reasonably mines and analyzes the data, seeks the internal regularities, and designs an effective model structure, making comprehensive use of effective supervised and unsupervised intelligent techniques; (2) a dynamic filtering algorithm based on the minimum-redundancy maximum-relevance (mRMR) criterion performs input-variable feature selection, yielding sample data with a compact attribute subset that retains the maximum relevance in the training samples while filtering out noisy information as far as possible; (3) combined with a metaheuristic algorithm, the method optimizes the premise and consequent parameters of the ANFIS network, giving the network effective generalization capability and producing prediction results that meet the accuracy requirement.
Drawings
FIG. 1 is a schematic diagram of a typical two-input single-output ANFIS network architecture;
FIG. 2 is a table of cumulative information score during input variable selection:
(a) predicting the time step length to be 1 hour;
(b) predicting the time step length for 2 hours;
(c) predicting the time step length for 3 hours;
(d) predicting the time step length to be 4 hours;
FIG. 3 is a graph of the results of a test month partial prediction period.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention relates to a multi-time-step wind power prediction method based on dynamic feature selection, and provides a hybrid intelligent model that mines historical power time series and public numerical weather prediction (NWP) data with a dynamic feature-selection algorithm, so as to address the difficulty of predicting wind power generation over different time steps. On the basis of the available raw data, a minimum-redundancy maximum-relevance (mRMR) dynamic filtering method automatically selects the input variables for each prediction horizon; the optimally featured input data are then trained by an adaptive neuro-fuzzy inference system (ANFIS) combined with a metaheuristic optimization algorithm, obtaining more accurate prediction results. To illustrate the effect of the invention, the method is described in detail below using the actual data of a wind farm in China (installed capacity 16 MW) as the implementation object:
Step (1): Construct the raw candidate feature vector set from the wind farm's historical power data recorded from September to November 2017 and publicly acquired numerical weather prediction information. Research shows that the wind speed at a given time is generally related to the historical wind speed of the past 8 hours; since the maximum prediction horizon is 4 h ahead at a time resolution of 15 minutes per step, the historical power sequence is taken as the power values of the past 4 hours, i.e. His(t) = [p(t), p(t−1), …, p(t−T)], where t is the current time and T = 15. The 16 historical-sequence attribute variables are thus recorded as Hin1, Hin2, …, Hin16. For the numerical weather prediction, the 5 attribute variables at the forecast time, i.e. wind speed, wind direction, air temperature, humidity, and air pressure, are recorded in order as {WindV, WindD, Temp, Hum, AirP}. The set of candidate input features for predicting time step τ at the current time t is as follows:

F(t, τ) = { p(t−T), …, p(t), wdv(t+τ), trd(t+τ), temp(t+τ), hum(t+τ), aip(t+τ) }  (1)

where p(t−T), p(t−T+1), …, p(t) are the interval-average power values of the historical time series; wdv(t+τ), trd(t+τ), temp(t+τ), hum(t+τ), aip(t+τ) are the NWP values of the forecast interval, in order wind speed, wind direction, air temperature, humidity, and air pressure; and τ = 1, 2, …, 16.
Step (2): To reduce the number of ANFIS input variables and the computational complexity without degrading prediction accuracy, features must be extracted or selected from the data so that enough relevant features are retained while irrelevant ones are removed as far as possible. Because the effective information that each feature variable contributes to the prediction differs across time steps, the invention provides a dynamic minimum-redundancy maximum-relevance (mRMR) feature-selection algorithm that ranks all attribute variables and computes their cumulative information contribution rate, thereby determining the attribute variables used as model inputs. The method comprises the following substeps:
(2.1) Computing mutual information between feature variables: mutual information measures both linear and nonlinear correlations between variables, and the concept is based on information entropy. The mutual information I(X; Y) between any two discrete random variables X and Y is defined by equation (2) and represents the amount of information that variable X provides in reducing the uncertainty of variable Y:

I(X; Y) = Σ_{x∈X} Σ_{y∈Y} p(x, y) · log[ p(x, y) / (p(x)p(y)) ]  (2)

where p(x, y) is the joint probability distribution function of the random variables X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively.
(2.2) Computing the feature information score: the minimum-redundancy maximum-relevance (mRMR) criterion considers maximum relevance and minimum redundancy simultaneously when identifying a subset of input variables. Simply combining individually informative input variables may not yield good prediction performance, so both the amount of information provided by a single input variable and the redundancy among variables should be taken into account. Under the mRMR criterion, the information score of a single input variable v_i is defined by equation (3):

J(v_i, S) = I(v_i; t) − (1/|S|) · Σ_{s∈S} I(v_i; s)  (3)

where V is the set of all candidate attribute variables, S is the set of already selected input variables, and |·| denotes the number of variables contained in a set. The first term on the right uses the mutual information I(v_i; t) between the prediction target t and the candidate input variable v_i to measure the strength of the candidate's relevance to the prediction task. The second term penalizes similarity, preferring variables with the least redundancy with respect to those already selected, so that the chosen set is more representative of, and informative about, the whole.
(2.3) Dynamically selecting and ranking feature variables: input variables are selected by an incremental forward search according to equation (3) of step (2.2), and the candidate feature variables are selected and ranked by their information scores J(v_i, S). In the first step, the maximum-relevance component of every candidate attribute variable, i.e. its mutual information with the prediction target, is computed, and the variable with the largest mutual information becomes the first input variable:

s_1 = argmax_{v∈V} I(v; t)  (4)
the selection (sorting) of the remaining variables is obtained according to the size comparison of the information content scores contained in formula (3): in step m (2. ltoreq. m. ltoreq. V.ltoreq.) the remaining variables are V-Sm-1In which S ism-1Is an ordered set s of selected (or ordered) variables in m-11,s2,...,sm-1Thus, the variables chosen in m are:
the algorithm dynamically searches forward one step each time an input variable is selected, so the process incrementally evaluates all candidate variables until step | V |, all attribute variables V are evaluated and ranked according to mRMR criteria.
(2.4) Determining the final filtered feature variables: from the ranked features obtained in step (2.3), compute each variable's information score InSc_m and cumulative information contribution CumInSc_m, as in equations (6)-(7):

InSc_1 = I(s_1; t), InSc_m = J(s_m, S_{m−1}), m = 2, 3, …, |V|  (6)

CumInSc_m = Σ_{k=1}^{m} InSc_k  (7)

When CumInSc_m reaches its maximum, the corresponding ordered variable set S_max gives the final input variables; it represents the least redundant and most relevant feature attributes and is used for regression pattern mining by the subsequent machine-learning algorithm.
Step (3): Using the sample data containing the optimal feature variables obtained in step (2) as the training set, optimize and tune the network parameters of the adaptive neuro-fuzzy inference system (ANFIS), through the following substeps:
(3.1) Determining the ANFIS network structure: set the membership-function shape type and the number of fuzzy sets of the ANFIS, which together determine the network structure. Once the structure is fixed, the training parameters are also determined, comprising the premise parameters and the consequent parameters;
For a classical two-input single-output network, the ANFIS is constructed as follows. Let the inputs be x1, x2 and the output be y; taking two fuzzy sets per input as an example, the inference rules are expressed as:
Rule 1: if x1 is A1 and x2 is B1, then y = p1·x1 + q1·x2 + r1;
Rule 2: if x1 is A1 and x2 is B2, then y = p2·x1 + q2·x2 + r2;
Rule 3: if x1 is A2 and x2 is B1, then y = p3·x1 + q3·x2 + r3;
Rule 4: if x1 is A2 and x2 is B2, then y = p4·x1 + q4·x2 + r4;
The structure of each layer of the ANFIS is described as follows:
Layer 1 (fuzzification): the input variables are fuzzified, and a membership function determines the degree of membership of each input in each fuzzy set:

μ_{A_i}(x1) = MF(x1), μ_{B_j}(x2) = MF(x2), i, j = 1, 2  (8)

where A_i, B_j are the partitioned fuzzy sets, μ is the membership degree in the corresponding case, and MF(·) is the membership function; for a Gaussian curve, MF(x) = exp(−(x − c)²/(2σ²)) as in equation (12), and the parameter set {c, σ} constitutes the premise parameters of the ANFIS.
Layer 2 (rule inference): the rule neurons receive inputs from their respective fuzzification neurons and compute the activation weight under each inference rule:

ω_n = μ_{A_i}(x1) · μ_{B_j}(x2), n = 1, 2, 3, 4; i, j = 1, 2  (10)
Layer 3 (normalization): each neuron in this layer receives all neuron outputs from the previous layer and normalizes the activation weight of each rule;
Layer 4 (defuzzification): this layer defuzzifies the activation weight of each rule with a rule interpretation function and computes the weighted consequent value of the given rule f_n. The parameter set {p_n, q_n, r_n} of the linear interpretation function constitutes the conclusion parameters of the ANFIS;
Layer 5 (output): this layer sums all defuzzified neuron outputs to yield the final output y of the ANFIS.
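The five-layer forward pass above can be sketched for the two-input, four-rule case as follows. This is a minimal illustration; the Gaussian membership parameters, the consequent parameters, and the input values are arbitrary assumptions, not trained values:

```python
import math

def anfis_forward(x1, x2, premise, conclusion):
    """Forward pass of the two-input, four-rule Sugeno ANFIS described above.

    premise:    {'A': [(c, sigma), (c, sigma)], 'B': [(c, sigma), (c, sigma)]}
                Gaussian membership parameters {c, sigma} (premise parameters)
    conclusion: four (p_n, q_n, r_n) tuples, one per rule (conclusion parameters)
    """
    gauss = lambda x, c, s: math.exp(-((x - c) ** 2) / (2.0 * s ** 2))

    # Layer 1 (fuzzification): membership degrees, as in equation (8)
    mu_a = [gauss(x1, c, s) for c, s in premise['A']]
    mu_b = [gauss(x2, c, s) for c, s in premise['B']]

    # Layer 2 (rule inference): firing strengths, as in equation (10); rule order (A_i, B_j)
    w = [mu_a[i] * mu_b[j] for i in range(2) for j in range(2)]

    # Layer 3 (normalization)
    w_bar = [wn / sum(w) for wn in w]

    # Layers 4-5 (defuzzification and summation): y = sum_n wbar_n * (p_n*x1 + q_n*x2 + r_n)
    return sum(wb * (p * x1 + q * x2 + r)
               for wb, (p, q, r) in zip(w_bar, conclusion))

# arbitrary (untrained) parameters for illustration only
premise = {'A': [(0.0, 1.0), (1.0, 1.0)], 'B': [(0.0, 1.0), (1.0, 1.0)]}
conclusion = [(1.0, 1.0, 0.0)] * 4        # every rule: f_n = x1 + x2
y = anfis_forward(0.3, 0.7, premise, conclusion)
```

With all four consequents set to f_n = x1 + x2, the normalized weights sum to one, so the output reduces to x1 + x2 regardless of the firing strengths, which makes the sketch easy to check by hand.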
(3.2) Optimize the ANFIS training parameters: after the ANFIS initial structure and parameters are determined, optimize the premise parameters, which determine the membership-function shapes, and the consequent parameters of the rule interpretation functions. A particle swarm optimization (PSO) algorithm performs a heuristic search over the network training parameters so that the root mean square error of the sample data reaches its minimum; the objective function is:
where P_oi is the output of the ANFIS network for the i-th input sample, P_ri is the actual measured value of the corresponding sample, G_i is the online (operating) capacity used for normalization, and N is the number of samples.
Optimizing the ANFIS parameters via particle swarm optimization (PSO) comprises the following sub-steps:
Step 1: PSO initialization: set the algorithm parameters, including the population size N_s, the total number of iterations T_iter, the maximum velocity v_max, and the learning factors c_1, c_2; initialize the population positions x_i = (x_i1, x_i2, ..., x_idim), i = 1, 2, ..., N_s, where dim is the number of ANFIS parameters and the position vector x_i of each particle represents one set of ANFIS parameters; initialize the velocity vectors v_i = (v_i1, v_i2, ..., v_idim), with v_id ∈ [-v_max, v_max]; set the initial iteration index t = 0 and the initial inertia factor w_0 = 1;
Step 2: calculate the fitness value Fit[i] = f(x_i) of each particle, i = 1, 2, ..., N_s;
Step 3: select the global best solution from the population according to the fitness values and record it as Gbest = (P_g1, P_g2, ..., P_gdim);
Step 4: retain the best position experienced by each particle i, denoted Pbest_i = (P_i1, P_i2, ..., P_idim);
Step 5: update the next-generation particle velocities and positions as follows:
v_id(t+1) = v_id(t) + c_1·r_1·(P_id - x_id) + c_2·r_2·(P_gd - x_id) (15)
x_id(t+1) = x_id(t) + w·v_id(t+1) (16)
i = 1, 2, ..., N_s; d = 1, 2, ..., dim
where r_1, r_2 are random numbers on (0, 1), and w is an inertia parameter that decreases linearly from 1 to 0 with the iteration count: w = 1 - t/T_iter;
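Steps 1-5 can be sketched as follows. This is an illustrative sketch only: the ANFIS RMSE objective is replaced by a stand-in sphere function, and the population size, iteration count, learning factors, and search bounds are assumed values. Note that, following equations (15)-(16) as written, the old velocity carries no inertia weight, while the inertia factor w multiplies the position increment:

```python
import random

def pso(f, dim, n_s=20, t_iter=100, v_max=0.5, c1=2.0, c2=2.0, lo=-5.0, hi=5.0):
    """Minimize f over [lo, hi]^dim with the PSO updates of equations (15)-(16)."""
    random.seed(0)                               # reproducible for illustration
    # Step 1: initialize positions and velocities
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_s)]
    v = [[random.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_s)]
    pbest = [xi[:] for xi in x]                  # Step 4: best position per particle
    pbest_f = [f(xi) for xi in x]                # Step 2: fitness values
    g = min(range(n_s), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]     # Step 3: global best Gbest

    for t in range(t_iter):
        w = 1.0 - t / t_iter                     # inertia: linear decrease from 1 to 0
        for i in range(n_s):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # equation (15): velocity update, clamped to [-v_max, v_max]
                v[i][d] += c1 * r1 * (pbest[i][d] - x[i][d]) + c2 * r2 * (gbest[d] - x[i][d])
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                # equation (16): position update with inertia factor w
                x[i][d] += w * v[i][d]
            fi = f(x[i])
            if fi < pbest_f[i]:                  # Step 4: update personal best
                pbest[i], pbest_f[i] = x[i][:], fi
                if fi < gbest_f:                 # Step 3: update global best
                    gbest, gbest_f = x[i][:], fi
    return gbest, gbest_f

# stand-in objective: sphere function, minimum 0 at the origin
best, best_f = pso(lambda p: sum(z * z for z in p), dim=3)
```

In the patented method, f(x_i) would instead evaluate the ANFIS RMSE objective over the training set with the particle's position decoded into premise and conclusion parameters.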
Step (4): perform the prediction: at the current time t, for prediction tasks with different time steps τ, prepare the input vector containing the candidate features according to step (1) and construct it from the optimal input features determined in step (2). Once the input vector is determined, feed it into the ANFIS trained in step (3); the resulting output is the predicted value under the current conditions.
For prediction cases with different time steps, namely τ = 1, 2, ..., 16, the mRMR-based dynamic input-variable selection algorithm of step (2) is applied to the candidate attribute features of step (1); the features are sorted by equation (5) in step (2.3), the information score InSc_m of each feature is obtained by equation (6) in step (2.4), and the trend of the cumulative score curve CumInSc_m determines the final candidate variables. Cumulative score curves for some of the prediction steps are shown in Fig. 2. The candidate feature variables and their ordering for each prediction step are detailed in Table 1, and the monthly normalized root mean square error (nRMSE) of the model predictions at different prediction steps is given in Table 2. For intuition, Fig. 3 compares the predicted curve with the actual curve over part of the time interval. The experimental results clearly show that the model achieves satisfactory predictions across the different time steps; in particular, the monthly nRMSE at the end of the fourth hour (#16) is below 15%, so the prediction accuracy meets practical engineering needs.
TABLE 1 Attribute variables chosen for different prediction step sizes and their ordering
TABLE 2 Monthly normalized root mean square error (nRMSE) for different prediction step lengths
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (3)
1. The multi-time step wind power prediction method based on dynamic feature selection is characterized by comprising the following steps:
Step (1): for predictions at different time steps, construct the original input variable set from historical power record data and public numerical weather prediction (NWP) information; this set comprises features of the power time series and weather elements including air temperature, air pressure, humidity, wind speed, and wind direction; the set of input feature variables for predicting time step τ at the current time t is as follows:
where p(t-T), p(t-T+1), ..., p(t) are the interval-averaged power values of the historical time series and T is the length of the time series; wdv(t+τ), trd(t+τ), temp(t+τ), hum(t+τ), aip(t+τ) are the NWP values for the prediction period, namely the wind speed, wind direction, air temperature, humidity, and air pressure, in that order;
Step (2): sort the attribute variables and calculate their cumulative information contribution rates so as to determine the attribute variables used as model inputs, comprising the following sub-steps:
(2.1) Calculate the mutual information among the feature variables: mutual information measures both linear and nonlinear correlations between variables; its definition is based on information entropy. The mutual information I(X; Y) between any two discrete random variables X and Y is defined by equation (2) and represents the amount of information that variable X provides in reducing the uncertainty of variable Y:
where p(x, y) is the joint probability distribution function of the random variables X and Y, and p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively;
(2.2) Calculate the feature information score: the minimal-redundancy-maximal-relevance (mRMR) criterion identifies a subset of input variables that simultaneously maximizes relevance and minimizes redundancy; under the mRMR criterion, the information score of a single input variable v_i is defined as in equation (3):
where V is the set of all candidate attribute variables, S is the set of already selected input variables, and |·| denotes the number of variables in a set; the first term on the right-hand side uses the mutual information I(v_i; t) between the target t and the candidate input variable v_i to measure the relevance of v_i to the prediction process, while the second term favors variables with the least similarity to, or redundancy with, the already selected variables;
(2.3) Dynamically select and sort the feature variables: input variables are selected by an incremental forward search based on equation (3) in step (2.2), with the candidate feature variables selected and sorted by the information score J(v_i, S); in the first step, the maximal-relevance component of every candidate attribute variable, i.e., its mutual information with the prediction target, is computed, and the variable with the maximum mutual information is chosen as the first input variable:
the sorting and selection of the remaining variables follow from comparing the information scores of equation (3): at step m, the remaining variables are V - S_{m-1}, where 2 ≤ m ≤ |V| and S_{m-1} is the ordered set {s_1, s_2, ..., s_{m-1}} of variables already sorted at step m-1; thus, the variable selected at step m is:
each selection of an input variable advances the dynamic search one step forward, so the process progressively evaluates all candidate variables; by step |V|, all attribute variables in V have been evaluated and sorted according to the mRMR criterion;
(2.4) Determine the final filtered feature variables: from the sorted features obtained in step (2.3), compute the information score InSc_m of each variable and the cumulative information contribution rate CumInSc_m, as shown in equations (6)-(7):
InSc_m = J(s_m, S_{m-1}), m = 2, 3, ..., |V| (6)
when CumInSc_m reaches its maximum value, the corresponding ordered variable set S_max is taken as the final set of input variables, representing the feature attributes with minimum redundancy and maximum relevance, for regression pattern mining by the subsequent machine learning algorithm;
Step (3): take the ordered set S_max of optimal feature variables obtained in step (2) as the training set, and optimize and adjust the network parameters of the adaptive neuro-fuzzy inference system (ANFIS), comprising the following sub-steps:
(3.1) Determine the ANFIS network structure: set the membership-function shape type and the number of fuzzy sets of the adaptive neuro-fuzzy inference system (ANFIS), thereby determining the ANFIS network structure; once the network structure is determined, the training parameters are also determined, comprising the premise parameters and the conclusion parameters;
(3.2) Optimize the ANFIS training parameters: after the ANFIS initial structure and parameters are determined, optimize the premise parameters, which determine the membership-function shapes, and the consequent parameters of the rule interpretation functions; a particle swarm optimization (PSO) algorithm performs a heuristic search over the network training parameters so that the root mean square error of the sample data reaches its minimum, with the objective function:
where P_oi is the output of the ANFIS network for the i-th input sample, P_ri is the actual measured value of the corresponding sample, G_i is the online (operating) capacity used for normalization, and N is the number of samples;
Step (4): perform the prediction;
at the current time t, for prediction tasks with different time steps τ, prepare the input vectors containing the candidate features according to step (1) and construct them from the optimal input features determined in step (2); once the input vector is determined, feed it into the ANFIS trained in step (3), and take the resulting output as the predicted value under the current conditions.
2. The method of claim 1, wherein the adaptive neuro-fuzzy inference system (ANFIS) in step (3) is structured as follows:
It has a two-input, single-output structure with inputs x_1, x_2 and output y; the ANFIS comprises the following five-layer structure:
Layer 1: the input variables are fuzzified; a membership function determines the degree of membership of each input in every fuzzy set;
μ_{A_i}(x_1) = MF(x_1), μ_{B_j}(x_2) = MF(x_2), i, j = 1, 2 (9)
where A_i, B_j are the partitioned fuzzy sets, μ is the corresponding membership degree, and MF(·) is the membership function; for the Gaussian curve, as shown in equation (12), the parameter set {c, σ} constitutes the premise parameters of the ANFIS;
Layer 2: each rule neuron receives inputs from the corresponding fuzzification neurons and computes the activation weight (firing strength) of its inference rule;
ω_n = μ_{A_i}(x_1)·μ_{B_j}(x_2), n = 1, 2, 3, 4; i, j = 1, 2 (11)
Layer 3: each neuron in this layer receives all neuron outputs from the previous layer and normalizes the activation weight of each rule;
Layer 4: this layer defuzzifies the activation weight of each rule with a rule interpretation function and computes the weighted consequent value of the given rule f_n; the parameter set {p_n, q_n, r_n} of the linear interpretation function constitutes the conclusion parameters of the ANFIS;
Layer 5: this layer sums all defuzzified neuron outputs to obtain the final output y of the ANFIS.
3. The method according to claim 1, wherein optimizing the ANFIS solution by particle swarm optimization (PSO) in step (3.2) comprises the following sub-steps:
Step 1: PSO initialization: set the algorithm parameters, including the population size N_s, the total number of iterations T_iter, the maximum velocity v_max, and the learning factors c_1, c_2; initialize the population positions x_i = (x_i1, x_i2, ..., x_idim), i = 1, 2, ..., N_s, where dim is the number of ANFIS parameters and the position vector x_i of each particle represents one set of ANFIS parameters; initialize the velocity vectors v_i = (v_i1, v_i2, ..., v_idim), with v_id ∈ [-v_max, v_max]; set the initial iteration index t = 0 and the initial inertia factor w_0 = 1;
Step 2: calculate the fitness value Fit[i] = f(x_i) of each particle, i = 1, 2, ..., N_s;
Step 3: select the global best solution from the population according to the fitness values and record it as Gbest = (P_g1, P_g2, ..., P_gdim);
Step 4: retain the best position experienced by each particle i, denoted Pbest_i = (P_i1, P_i2, ..., P_idim);
Step 5: update the next-generation particle velocities and positions as follows:
v_id(t+1) = v_id(t) + c_1·r_1·(P_id - x_id) + c_2·r_2·(P_gd - x_id) (15)
x_id(t+1) = x_id(t) + w·v_id(t+1) (16)
i = 1, 2, ..., N_s; d = 1, 2, ..., dim
where r_1, r_2 are random numbers on (0, 1), and w is an inertia parameter that decreases linearly from 1 to 0 with the iteration count: w = 1 - t/T_iter;
Step 6: t = t + 1; if t > T_iter, the algorithm terminates: output the population's global best solution Gbest as the final solution, with the optimal solution serving as the ANFIS optimal parameters; otherwise, go to Step 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910401140.2A CN110674965A (en) | 2019-05-15 | 2019-05-15 | Multi-time step wind power prediction method based on dynamic feature selection |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110674965A true CN110674965A (en) | 2020-01-10 |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111639437A (en) * | 2020-06-08 | 2020-09-08 | 中国水利水电科学研究院 | Method for dynamically changing WRF mode parameterization scheme combination based on ground air pressure distribution situation |
CN113239620A (en) * | 2021-05-11 | 2021-08-10 | 中国电建集团华东勘测设计研究院有限公司 | Improved particle swarm method for parameter identification of geotechnical material constitutive model based on GPU acceleration |
CN113404655A (en) * | 2020-08-24 | 2021-09-17 | 湖南科技大学 | Wind driven generator sensor state diagnosis system based on PS0-ANFIS |
CN113468794A (en) * | 2020-12-29 | 2021-10-01 | 重庆大学 | Temperature and humidity prediction and reverse optimization method for small-sized closed space |
CN113743652A (en) * | 2021-08-06 | 2021-12-03 | 广西大学 | Sugarcane squeezing process prediction method based on depth feature recognition |
CN114169252A (en) * | 2021-12-27 | 2022-03-11 | 广东工业大学 | Short-term region wind power prediction method for dynamically selecting representative wind power plant |
CN114202065A (en) * | 2022-02-17 | 2022-03-18 | 之江实验室 | Stream data prediction method and device based on incremental evolution LSTM |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102102626A (en) * | 2011-01-30 | 2011-06-22 | 华北电力大学 | Method for forecasting short-term power in wind power station |
CN104933489A (en) * | 2015-06-29 | 2015-09-23 | 东北电力大学 | Wind power real-time high precision prediction method based on adaptive neuro-fuzzy inference system |
CN108832663A (en) * | 2018-07-18 | 2018-11-16 | 北京天诚同创电气有限公司 | The prediction technique and equipment of the generated output of micro-capacitance sensor photovoltaic generating system |
CN109255726A (en) * | 2018-09-07 | 2019-01-22 | 中国电建集团华东勘测设计研究院有限公司 | A kind of ultra-short term wind power prediction method of Hybrid Intelligent Technology |
Non-Patent Citations (1)
Title |
---|
DONG Wei et al., "Distributed Wind Power Generation Forecasting Method Based on a Hybrid Intelligent Model", Distribution & Utilization (《供用电》) *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20200110 |