CN110674965A - Multi-time step wind power prediction method based on dynamic feature selection - Google Patents

Multi-time step wind power prediction method based on dynamic feature selection

Info

Publication number
CN110674965A
Authority
CN
China
Prior art keywords
variables
anfis
input
variable
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910401140.2A
Other languages
Chinese (zh)
Inventor
房新力
杨强
富强
邬雪松
程开宇
董伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PowerChina Huadong Engineering Corp Ltd
Original Assignee
PowerChina Huadong Engineering Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PowerChina Huadong Engineering Corp Ltd filed Critical PowerChina Huadong Engineering Corp Ltd
Priority to CN201910401140.2A priority Critical patent/CN110674965A/en
Publication of CN110674965A publication Critical patent/CN110674965A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/004Artificial life, i.e. computing arrangements simulating life
    • G06N3/006Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/043Architecture, e.g. interconnection topology based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models
    • G06N5/048Fuzzy inferencing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Mathematical Physics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Automation & Control Theory (AREA)
  • Fuzzy Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Water Supply & Treatment (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a multi-time-step wind power prediction method based on dynamic feature selection. It provides a hybrid intelligent model that mines historical power time series and public numerical weather prediction (NWP) data with a dynamic feature selection algorithm, addressing the difficulty of predicting wind power generation at different time steps. Starting from the available raw data, the method applies a dynamic filter based on the minimum redundancy maximum relevance (mRMR) criterion to automatically select the input variables for each prediction step size. The input data with the selected features are then used for supervised learning in an Adaptive Neuro-Fuzzy Inference System (ANFIS), whose parameters are trained with a Particle Swarm Optimization (PSO) algorithm to achieve the best prediction performance. Finally, the proposed hybrid model is evaluated on operating data from an actual distributed wind turbine generator, and the experimental results verify its effectiveness.

Description

Multi-time step wind power prediction method based on dynamic feature selection
Technical Field
The invention relates to the field of wind power prediction of multiple time steps of new energy power generation, in particular to a multi-time step wind power prediction method based on dynamic feature selection.
Background
In recent years, the urgent need for a low-carbon economy and advances in wind power technology have driven a rapid, sustainable transformation of the energy industry and the growth of wind power worldwide. Because wind power is intermittent and random, it is difficult to integrate into the grid. Active distribution network technology is considered an effective way to use renewable wind power on a large scale, and one of its key technologies is the ability to analyze and predict wind power output. Accurate multi-time-step wind power prediction can improve wind power utilization and system reliability, reduce operating costs, and allow flexible scheduling strategies. However, multi-time-step prediction is an important and difficult task that requires advanced algorithms and tools with sufficient accuracy and acceptable computational complexity.
The prediction method is a key factor in prediction accuracy. Data-driven artificial intelligence models mine the patterns among the available information variables and can greatly improve prediction accuracy. However, analyzing and predicting directly from all available information cannot fully reveal the relationship between future wind power output and each factor; it introduces redundant information and lets noise drown out important patterns, so multi-time-step wind power prediction becomes a complex, multivariable, highly nonlinear problem. A hybrid intelligent method that selects input variables through targeted feature selection is an effective solution to this problem.
Disclosure of Invention
The invention aims to provide a multi-time-step wind power prediction method based on dynamic feature selection that addresses the shortcomings of the prior art and solves the problem that wind power generation is difficult to predict at different time steps. The technical scheme adopted by the invention is as follows:
Step (1): For predictions at different time steps, an original input variable set is constructed from historical power records and public numerical weather prediction (NWP) information. The set comprises features of the power time series and weather elements such as air temperature, air pressure, humidity, wind speed and wind direction. The set of input variable features for predicting time step $\tau$ at the current time $t$ is:
$$\{\,p(t-T),\ p(t-T+1),\ \ldots,\ p(t),\ wdv(t+\tau),\ trd(t+\tau),\ temp(t+\tau),\ hum(t+\tau),\ aip(t+\tau)\,\} \tag{1}$$
where $p(t-T), p(t-T+1), \ldots, p(t)$ are the interval-average power values of the historical time series and $T$ is the length of the time series; $wdv(t+\tau)$, $trd(t+\tau)$, $temp(t+\tau)$, $hum(t+\tau)$ and $aip(t+\tau)$ are the NWP values for the predicted period: wind speed, wind direction, air temperature, humidity and air pressure, in that order. For ultra-short-term prediction, a wind power sequence covering the next 4 hours is typically required at a time resolution of 15 minutes, so the prediction step can be taken as $\tau = 1, 2, \ldots, 16$.
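As an illustration of step (1), the following minimal sketch assembles one candidate input vector. The function name `build_candidate_vector`, the dictionary layout of the NWP data and the use of list indices as 15-minute time steps are assumptions made for this example; they do not come from the patent text.

```python
from typing import Dict, List, Sequence

# NWP variable names taken from the text: wind speed, wind direction,
# air temperature, humidity, air pressure (in that order).
NWP_KEYS = ("wdv", "trd", "temp", "hum", "aip")


def build_candidate_vector(
    power: Sequence[float],
    nwp: Dict[str, Sequence[float]],
    t: int,
    tau: int,
    T: int,
) -> List[float]:
    """Return [p(t-T), ..., p(t), wdv(t+tau), trd(t+tau), temp(t+tau), hum(t+tau), aip(t+tau)].

    Hypothetical layout: `power[k]` is the average power of time step k and
    `nwp[name][k]` is the forecast value of variable `name` for time step k.
    """
    if t - T < 0:
        raise IndexError("not enough history before time t")
    history = list(power[t - T : t + 1])           # T + 1 historical power values
    forecast = [nwp[key][t + tau] for key in NWP_KEYS]  # 5 NWP values at t + tau
    return history + forecast
```

With T = 15, each vector holds 16 historical power values followed by 5 NWP values, i.e. 21 candidate variables per sample.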
Step (2): in order to reduce the number of input variables of the ANFIS network, reduce the computational complexity and not influence the prediction precision, the characteristics of the data need to be extracted or selected, so that the data contain enough related characteristics, and the unrelated characteristics are reduced as much as possible. For the prediction of different time step lengths, effective information contribution values of all characteristic variables to prediction input are different, the invention provides a dynamic minimum redundancy maximum correlation (mRMR) characteristic selection algorithm for sequencing all attribute variables and calculating the accumulated information contribution rate of the attribute variables, thereby determining the attribute variables as model input. The method specifically comprises the following substeps:
(2.1) calculating mutual information among the characteristic variables: mutual information is used to measure linear and non-linear correlations between variables, the concept of which is based on information entropy. Mutual information I (X; Y) between any two discrete random variables X and Y is defined as equation (2), which represents the amount of information provided by the variable X in reducing the uncertainty of the variable Y:
Figure BDA0002059820160000022
where p (X, Y) is the joint probability distribution function of the random variables X and Y, and p (X) and p (Y) are the edge probability distribution functions of X and Y, respectively.
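The definition of mutual information for discrete variables can be sketched as follows. The equal-width binning in `discretize` is an assumption added for this example, since the text does not say how continuous power and weather values are discretized; the natural logarithm is likewise an arbitrary choice of unit.

```python
import math
from collections import Counter
from typing import List, Sequence


def discretize(values: Sequence[float], bins: int = 8) -> List[int]:
    """Map continuous values to equal-width bin indices (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # The maximum value would fall into bin `bins`, so clamp it to the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x: Sequence[int], y: Sequence[int]) -> float:
    """Plug-in estimate of I(X;Y) in nats from paired discrete samples."""
    n = len(x)
    joint = Counter(zip(x, y))
    count_x = Counter(x)
    count_y = Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log(p(a,b) / (p(a) p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
    return mi
```

For two identical binary sequences with equal class frequencies the estimate equals log 2, and for independent sequences it is 0.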
(2.2) Calculate the feature information score. The mRMR criterion considers maximum relevance and minimum redundancy together to identify a subset of input variables. Simply combining individually informative input variables may not give good prediction performance, so both the information provided by a single input variable and the redundancy among variables should be taken into account. Under the mRMR criterion, the information score of a single input variable $v_i$ is defined in equation (3):
$$J(v_i, S) = I(v_i; t) - \frac{1}{|S|}\sum_{s_j \in S} I(v_i; s_j), \quad v_i \in V - S \tag{3}$$
where $V$ is the set of all candidate attribute variables, $S$ is the set of selected input variables, and $|\cdot|$ is the number of variables in a set. The first term on the right-hand side uses the mutual information $I(v_i; t)$ between the target $t$ and the candidate input variable $v_i$ to measure how strongly $v_i$ is related to the prediction target. The second term penalizes redundancy with the variables already selected, so that the chosen variables are as dissimilar from one another as possible and the selected set is more representative of the whole set.
(2.3) Dynamically select and rank the feature variables. Input variables are selected by an incremental forward search based on equation (3) of step (2.2), and the candidate feature variables are selected and ranked by the information score $J(v_i, S)$. In the first step, the maximum-relevance component of every candidate attribute variable, i.e. its mutual information with the prediction target, is calculated, and the variable with the largest mutual information becomes the first input variable:
$$s_1 = \arg\max_{v_i \in V} I(v_i; t) \tag{4}$$
The remaining variables are ranked by comparing their information scores from equation (3). In step $m$, where $2 \le m \le |V|$, the remaining variables are $V - S_{m-1}$, where $S_{m-1} = \{s_1, s_2, \ldots, s_{m-1}\}$ is the ordered set of variables ranked in the first $m-1$ steps. The variable selected in step $m$ is therefore:
$$s_m = \arg\max_{v_i \in V - S_{m-1}} \left[ I(v_i; t) - \frac{1}{m-1}\sum_{s_j \in S_{m-1}} I(v_i; s_j) \right] \tag{5}$$
The algorithm searches forward one step each time an input variable is selected, so the process incrementally evaluates all candidates; by step $|V|$, all attribute variables in $V$ have been evaluated and ranked according to the mRMR criterion.
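The incremental forward search of steps (2.2) and (2.3) can be sketched as below. The sketch assumes the difference form of the mRMR score (relevance minus mean redundancy) and takes precomputed mutual information values as input; the function name `mrmr_rank` and both dictionary layouts are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def mrmr_rank(
    relevance: Dict[str, float],
    pair_mi: Dict[FrozenSet[str], float],
) -> List[Tuple[str, float]]:
    """Rank every candidate variable by incremental forward mRMR search.

    relevance[v]            : mutual information between candidate v and the target
    pair_mi[frozenset(a,b)] : mutual information between candidates a and b
    Returns (variable, score) pairs in selection order. The score stored for
    the first variable is its relevance alone (an assumption of this sketch).
    """
    remaining = sorted(relevance)  # sorted so that ties break deterministically
    first = max(remaining, key=lambda v: relevance[v])
    ranked = [(first, relevance[first])]
    remaining.remove(first)

    while remaining:
        selected = [name for name, _ in ranked]

        def score(v: str) -> float:
            # relevance minus mean redundancy with the already selected set
            redundancy = sum(pair_mi[frozenset((v, s))] for s in selected) / len(selected)
            return relevance[v] - redundancy

        best = max(remaining, key=score)
        ranked.append((best, score(best)))
        remaining.remove(best)
    return ranked
```

In a small case with relevances a = 0.9, b = 0.8, c = 0.3 and a strong a-b dependence of 0.7, the search picks a, then c, then b: although b is individually more relevant than c, most of its information is already carried by a.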
(2.4) Determine the final filtered feature variables. From the ranked features obtained in step (2.3), calculate the information score $InSc_m$ of each variable and the cumulative information contribution rate $CumInSc_m$, as shown in equations (6) and (7):
$$InSc_m = J(s_m, S_{m-1}), \quad m = 2, 3, \ldots, |V| \tag{6}$$
$$CumInSc_m = \sum_{k=2}^{m} InSc_k \tag{7}$$
When $CumInSc_m$ reaches its maximum, the corresponding ordered variable set $S_{max}$ is taken as the final input variables. These represent the least redundant and most relevant feature attributes and are used for regression pattern mining by the subsequent machine learning algorithm.
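The cut-off rule of step (2.4) can be sketched as follows, under the assumption that the cumulative score is a plain running sum of the per-step scores. The function name `select_by_cumulative_score` is hypothetical. The sum here starts at the first ranked variable; that only shifts every cumulative value by the same constant and does not move the position of the maximum.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple


def select_by_cumulative_score(
    ranked: Sequence[Tuple[str, float]],
) -> Tuple[List[str], List[float]]:
    """Keep the ranked variables up to the step where the running sum peaks.

    `ranked` holds (variable, score) pairs in selection order.
    Returns the kept variable names and the full cumulative curve.
    """
    names = [name for name, _ in ranked]
    cumulative = list(accumulate(score for _, score in ranked))
    # max() returns the first index in case of ties, i.e. the smallest subset
    best_step = max(range(len(cumulative)), key=lambda i: cumulative[i])
    return names[: best_step + 1], cumulative
```

Once a candidate's redundancy outweighs its relevance, its score turns negative and the running sum starts to fall, which is what makes the maximum a natural stopping point.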
Step (3): Use the sample data containing the optimal feature variables obtained in step (2) as the training set to optimize and adjust the network parameters of the Adaptive Neuro-Fuzzy Inference System (ANFIS). The step comprises the following substeps:
(3.1) Determine the ANFIS network structure. Set the membership function type and the number of fuzzy sets to fix the ANFIS network structure. Once the structure is fixed, the parameters to be trained are also fixed; they comprise the premise (precondition) parameters and the consequent (conclusion) parameters.
Take a classical two-input, single-output network as an example. Let the inputs be $x_1, x_2$ and the output be $y$. With two fuzzy sets per input, the inference rules are:
Rule 1: if $x_1$ is $A_1$ and $x_2$ is $B_1$, then $y = p_1 x_1 + q_1 x_2 + r_1$
Rule 2: if $x_1$ is $A_1$ and $x_2$ is $B_2$, then $y = p_2 x_1 + q_2 x_2 + r_2$
Rule 3: if $x_1$ is $A_2$ and $x_2$ is $B_1$, then $y = p_3 x_1 + q_3 x_2 + r_3$
Rule 4: if $x_1$ is $A_2$ and $x_2$ is $B_2$, then $y = p_4 x_1 + q_4 x_2 + r_4$
The layers of the ANFIS are as follows:
Layer 1 (fuzzification): fuzzifies the input variables, using a membership function to determine the degree of membership of each input in each fuzzy set:
$$\mu_{A_i}(x_1) = MF(x_1), \quad \mu_{B_j}(x_2) = MF(x_2), \quad i, j = 1, 2 \tag{8}$$
where $A_i, B_j$ are the partitioned fuzzy sets, $\mu$ is the corresponding degree of membership, and $MF(\cdot)$ is the membership function. For a Gaussian curve, as shown in equation (9), the parameter set $\{c, \sigma\}$ constitutes the premise parameters of the ANFIS.
$$MF(x) = \exp\left(-\frac{(x-c)^2}{2\sigma^2}\right) \tag{9}$$
Layer 2 (rule inference): each rule neuron receives inputs from its fuzzification neurons and calculates the activation weight of its inference rule:
$$\omega_n = \mu_{A_i}(x_1)\,\mu_{B_j}(x_2), \quad n = 1, 2, 3, 4; \quad i, j = 1, 2 \tag{10}$$
Layer 3 (normalization): each neuron receives the outputs of all neurons in the previous layer and normalizes the activation weight of its rule:
$$\bar{\omega}_n = \frac{\omega_n}{\sum_{k=1}^{4}\omega_k} \tag{11}$$
Layer 4 (defuzzification): this layer applies the rule consequent function $f_n$ to the normalized activation weight of each rule and calculates the weighted consequent value of the rule. The parameter set $\{p_n, q_n, r_n\}$ of the linear consequent function constitutes the consequent parameters of the ANFIS:
$$\bar{\omega}_n f_n = \bar{\omega}_n\,(p_n x_1 + q_n x_2 + r_n) \tag{12}$$
Layer 5 (output): this layer sums all the defuzzified neuron outputs to give the final output $y$ of the ANFIS:
$$y = \sum_{n=1}^{4} \bar{\omega}_n f_n \tag{13}$$
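The five layers can be traced in a short forward pass for the two-input, four-rule example. This is a sketch only: it assumes Gaussian membership functions with premise parameters (c, sigma) and first-order linear consequents (p, q, r), and the function and argument names are chosen for this example.

```python
import math
from typing import Sequence, Tuple


def gaussian_mf(x: float, c: float, sigma: float) -> float:
    """Gaussian membership function with centre c and width sigma."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(
    x1: float,
    x2: float,
    premise_a: Sequence[Tuple[float, float]],            # (c, sigma) for A1, A2
    premise_b: Sequence[Tuple[float, float]],            # (c, sigma) for B1, B2
    consequents: Sequence[Tuple[float, float, float]],   # (p, q, r) for rules 1..4
) -> float:
    # Layer 1: membership degree of each input in each fuzzy set
    mu_a = [gaussian_mf(x1, c, s) for c, s in premise_a]
    mu_b = [gaussian_mf(x2, c, s) for c, s in premise_b]
    # Layer 2: rule activation weights, ordered (A1,B1), (A1,B2), (A2,B1), (A2,B2)
    w = [ma * mb for ma in mu_a for mb in mu_b]
    # Layer 3: normalize the weights (assumes at least one rule fires, sum > 0)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted linear consequent of each rule
    weighted = [wb * (p * x1 + q * x2 + r) for wb, (p, q, r) in zip(w_bar, consequents)]
    # Layer 5: sum of all rule outputs
    return sum(weighted)
```

One quick sanity check: when all four rules share the same consequent (p, q, r), the normalized weights sum to 1 and the output reduces to p·x1 + q·x2 + r regardless of the premise parameters.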
(3.2) Optimize the ANFIS training parameters. After the initial structure and parameters of the ANFIS are determined, the premise parameters, which determine the shape of the membership functions, and the consequent parameters of the rule consequent functions are optimized. A Particle Swarm Optimization (PSO) algorithm performs a heuristic search over the network training parameters to minimize the root mean square error on the sample data. The objective function is:
$$f = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{P_{oi} - P_{ri}}{G_i}\right)^2} \tag{14}$$
where $P_{oi}$ is the output of the ANFIS network for the $i$-th input sample, $P_{ri}$ is the measured value of that sample, $G_i$ is the started-up (online) capacity, and $N$ is the number of samples.
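A capacity-normalized root mean square error of this kind can be computed as below. The exact form of the objective is an assumption here: the equation image is missing from the source, and this sketch follows only the symbol definitions (network output, measured value, capacity, sample count). The function name is hypothetical.

```python
import math
from typing import Sequence


def normalized_rmse(
    predicted: Sequence[float],
    measured: Sequence[float],
    capacity: Sequence[float],
) -> float:
    """Root mean square of the per-sample errors, each divided by its capacity."""
    n = len(predicted)
    squared = (
        ((po - pr) / g) ** 2
        for po, pr, g in zip(predicted, measured, capacity)
    )
    return math.sqrt(sum(squared) / n)
```

For instance, with predictions of 8 MW and 4 MW, measurements of 6 MW and 4 MW, and 16 MW of capacity for both samples, the per-sample normalized errors are 0.125 and 0, so the objective is 0.125/√2.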
Optimizing the ANFIS with the Particle Swarm Optimization (PSO) algorithm comprises the following substeps:
Step 1: Initialize the PSO algorithm. Set the algorithm parameters, including the population size $N_s$, the total number of iterations $T_{iter}$, the maximum velocity $v_{max}$ and the learning factors $c_1, c_2$. Initialize the population positions $x_i = (x_{i1}, x_{i2}, \ldots, x_{i,dim})$, $i = 1, 2, \ldots, N_s$, where $dim$ is the number of ANFIS parameters and the position vector $x_i$ of each particle represents one set of ANFIS parameters. Initialize the velocity vectors $v_i = (v_{i1}, v_{i2}, \ldots, v_{i,dim})$ with $v_{id} \in [-v_{max}, v_{max}]$. Set the initial iteration count $t = 0$ and the initial inertia factor $w_0 = 1$.
Step 2: Calculate the fitness value of each particle, $Fit[i] = f(x_i)$, $i = 1, 2, \ldots, N_s$.
Step 3: Select the global best solution from the population according to the fitness values and record it as $Gbest = (P_{g1}, P_{g2}, \ldots, P_{g,dim})$.
Step 4: Retain the best position visited by each particle $i$ and record it as $Pbest_i = (P_{i1}, P_{i2}, \ldots, P_{i,dim})$.
Step 5: Update the velocity and position of the next-generation particles as follows:
$$v_{id}(t+1) = v_{id}(t) + c_1 r_1 (P_{id} - x_{id}) + c_2 r_2 (P_{gd} - x_{id}) \tag{15}$$
$$x_{id}(t+1) = x_{id}(t) + w \cdot v_{id}(t+1) \tag{16}$$
$$i = 1, 2, \ldots, N_s; \quad d = 1, 2, \ldots, dim$$
where $r_1, r_2$ are random numbers in $(0, 1)$ and $w$ is an inertia factor that decreases linearly from 1 to 0 with the iteration count, $w = 1 - t/T_{iter}$.
Step 6: Set $t = t + 1$. If $t > T_{iter}$, the algorithm terminates and outputs the population's global best solution $Gbest$ as the final solution, which is used as the optimal ANFIS parameters; otherwise, go to Step 2.
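The six steps can be sketched as a generic minimizer. The update rules follow equations (15) and (16) as written in the text, with the linearly decreasing factor w applied in the position update. Everything else is an assumption for this example: the function name, the default parameter values, the initialization range `bound`, the velocity clamping after each update, and updating the global best as soon as a particle improves on it.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_minimize(
    f: Callable[[Sequence[float]], float],
    dim: int,
    n_particles: int = 20,
    t_iter: int = 100,
    v_max: float = 0.5,
    c1: float = 2.0,
    c2: float = 2.0,
    bound: float = 1.0,
    seed: int = 0,
) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    # Step 1: random positions in [-bound, bound] and velocities in [-v_max, v_max]
    x = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    # Steps 2-4: initial fitness, personal bests and global best
    pbest = [xi[:] for xi in x]
    pbest_fit = [f(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(t_iter):
        w = 1.0 - t / t_iter  # decreases linearly from 1 towards 0
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Step 5, velocity update, then clamp to [-v_max, v_max]
                vel = (v[i][d]
                       + c1 * r1 * (pbest[i][d] - x[i][d])
                       + c2 * r2 * (gbest[d] - x[i][d]))
                v[i][d] = max(-v_max, min(v_max, vel))
                # Step 5, position update scaled by w
                x[i][d] += w * v[i][d]
            fit = f(x[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = x[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = x[i][:], fit
    # Step 6: after t_iter iterations return the global best
    return gbest, gbest_fit
```

To train the ANFIS, `f` would map a flat parameter vector (premise parameters followed by consequent parameters) to the training error, so `dim` equals the number of ANFIS parameters.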
Step (4): Perform the prediction. At the current time $t$, for the prediction task at each time step $\tau$, prepare the candidate features according to step (1) and construct the input vector from the optimal input features determined in step (2). Feed the input vector into the ANFIS trained in step (3); the output is the predicted value under the current conditions.
The beneficial effects of the invention are as follows. To address the difficulty of predicting wind power over multiple time steps: (1) A hybrid prediction model mines and analyzes the data, seeks their internal regularities, and uses an effective model structure; it combines supervised and unsupervised learning techniques. (2) A dynamic filtering algorithm based on the minimum redundancy maximum relevance (mRMR) criterion selects the input variable features and yields sample data with a compact attribute subset, so that the condensed training samples retain as much relevant information as possible while noise is filtered out. (3) A metaheuristic algorithm optimizes the premise and consequent parameters of the ANFIS network, giving the network effective generalization ability and prediction results that meet the accuracy requirement.
Drawings
FIG. 1 is a schematic diagram of a typical two-input single-output ANFIS network architecture;
FIG. 2 is a table of cumulative information score during input variable selection:
(a) prediction time step of 1 hour;
(b) prediction time step of 2 hours;
(c) prediction time step of 3 hours;
(d) prediction time step of 4 hours;
FIG. 3 is a graph of the results of a test month partial prediction period.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
The invention relates to a multi-time-step wind power prediction method based on dynamic feature selection. It provides a hybrid intelligent model that mines historical power time series and public numerical weather prediction (NWP) data with a dynamic feature selection algorithm, addressing the difficulty of predicting wind power generation at different time steps. Starting from the available raw data, the method applies a dynamic filter based on the minimum redundancy maximum relevance (mRMR) criterion to automatically select the input variables for each prediction step size. The input data with the selected features are then used to train an Adaptive Neuro-Fuzzy Inference System (ANFIS) with a metaheuristic optimization algorithm, giving a more accurate prediction result. To illustrate the effect of the invention, the method is described in detail below using actual data from a wind farm in China (installed capacity 16 MW):
Step (1): Construct the original candidate feature vector set from the historical power data recorded at the wind farm from September to November 2017 and publicly available numerical weather prediction information. Research shows that the wind speed at a given time is generally related to the historical wind speed over the past 8 hours. Because the maximum prediction horizon is 4 hours ahead and the time resolution is 15 minutes per step, the historical power sequence is taken as the power values over the past 4 hours, i.e. $His(t) = [p(t), p(t-1), \ldots, p(t-T)]$, where $t$ is the current time and $T = 15$. The 16 historical sequence attribute variables are denoted {Hin1, Hin2, ..., Hin16}. For the numerical weather prediction, the 5 attribute variables at the predicted time (wind speed, wind direction, air temperature, humidity and air pressure) are denoted {WindV, WindD, Temp, Hum, AirP}, in that order. The set of input variable features for predicting time step $\tau$ at the current time $t$ is:
$$\{\,p(t-T),\ p(t-T+1),\ \ldots,\ p(t),\ wdv(t+\tau),\ trd(t+\tau),\ temp(t+\tau),\ hum(t+\tau),\ aip(t+\tau)\,\} \tag{1}$$
where $p(t-T), p(t-T+1), \ldots, p(t)$ are the interval-average power values of the historical time series; $wdv(t+\tau)$, $trd(t+\tau)$, $temp(t+\tau)$, $hum(t+\tau)$ and $aip(t+\tau)$ are the NWP values for the predicted period: wind speed, wind direction, air temperature, humidity and air pressure, in that order; and $\tau = 1, 2, \ldots, 16$.
Step (2): in order to reduce the number of input variables of the ANFIS network, reduce the computational complexity and not influence the prediction precision, the characteristics of the data need to be extracted or selected, so that the data contain enough related characteristics, and the unrelated characteristics are reduced as much as possible. For the prediction of different time step lengths, effective information contribution values of all characteristic variables to prediction input are different, the invention provides a dynamic minimum redundancy maximum correlation (mRMR) characteristic selection algorithm for sequencing all attribute variables and calculating the accumulated information contribution rate of the attribute variables, thereby determining the attribute variables as model input. The method specifically comprises the following substeps:
(2.1) calculating mutual information among the characteristic variables: mutual information is used to measure linear and non-linear correlations between variables, the concept of which is based on information entropy. Mutual information I (X; Y) between any two discrete random variables X and Y is defined as equation (2), which represents the amount of information provided by the variable X in reducing the uncertainty of the variable Y:
Figure BDA0002059820160000082
where p (X, Y) is the joint probability distribution function of the random variables X and Y, and p (X) and p (Y) are the edge probability distribution functions of X and Y, respectively.
(2.2) Calculate the feature information score. The mRMR criterion considers maximum relevance and minimum redundancy together to identify a subset of input variables. Simply combining individually informative input variables may not give good prediction performance, so both the information provided by a single input variable and the redundancy among variables should be taken into account. Under the mRMR criterion, the information score of a single input variable $v_i$ is defined in equation (3):
$$J(v_i, S) = I(v_i; t) - \frac{1}{|S|}\sum_{s_j \in S} I(v_i; s_j), \quad v_i \in V - S \tag{3}$$
where $V$ is the set of all candidate attribute variables, $S$ is the set of selected input variables, and $|\cdot|$ is the number of variables in a set. The first term on the right-hand side uses the mutual information $I(v_i; t)$ between the target $t$ and the candidate input variable $v_i$ to measure how strongly $v_i$ is related to the prediction target. The second term penalizes redundancy with the variables already selected, so that the chosen variables are as dissimilar from one another as possible and the selected set is more representative of the whole set.
(2.3) Dynamically select and rank the feature variables. Input variables are selected by an incremental forward search based on equation (3) of step (2.2), and the candidate feature variables are selected and ranked by the information score $J(v_i, S)$. In the first step, the maximum-relevance component of every candidate attribute variable, i.e. its mutual information with the prediction target, is calculated, and the variable with the largest mutual information becomes the first input variable:
$$s_1 = \arg\max_{v_i \in V} I(v_i; t) \tag{4}$$
The remaining variables are selected (ranked) by comparing their information scores from equation (3). In step $m$ ($2 \le m \le |V|$), the remaining variables are $V - S_{m-1}$, where $S_{m-1} = \{s_1, s_2, \ldots, s_{m-1}\}$ is the ordered set of variables selected (ranked) in the first $m-1$ steps. The variable chosen in step $m$ is therefore:
$$s_m = \arg\max_{v_i \in V - S_{m-1}} \left[ I(v_i; t) - \frac{1}{m-1}\sum_{s_j \in S_{m-1}} I(v_i; s_j) \right] \tag{5}$$
The algorithm searches forward one step each time an input variable is selected, so the process incrementally evaluates all candidates; by step $|V|$, all attribute variables in $V$ have been evaluated and ranked according to the mRMR criterion.
(2.4) Determine the final filtered feature variables. From the ranked features obtained in step (2.3), calculate the information score $InSc_m$ of each variable and the cumulative information contribution rate $CumInSc_m$, as shown in equations (6) and (7):
$$InSc_m = J(s_m, S_{m-1}), \quad m = 2, 3, \ldots, |V| \tag{6}$$
$$CumInSc_m = \sum_{k=2}^{m} InSc_k \tag{7}$$
When $CumInSc_m$ reaches its maximum, the corresponding ordered variable set $S_{max}$ is taken as the final input variables. These represent the least redundant and most relevant feature attributes and are used for regression pattern mining by the subsequent machine learning algorithm.
Step (3): Use the sample data containing the optimal feature variables obtained in step (2) as the training set to optimize and adjust the network parameters of the Adaptive Neuro-Fuzzy Inference System (ANFIS). The step comprises the following substeps:
(3.1) Determine the ANFIS network structure. Set the membership function type and the number of fuzzy sets to fix the ANFIS network structure. Once the structure is fixed, the parameters to be trained are also fixed; they comprise the premise (precondition) parameters and the consequent (conclusion) parameters.
Take a classical two-input, single-output network as an example. Let the inputs be $x_1, x_2$ and the output be $y$. With two fuzzy sets per input, the inference rules are:
Rule 1: if $x_1$ is $A_1$ and $x_2$ is $B_1$, then $y = p_1 x_1 + q_1 x_2 + r_1$
Rule 2: if $x_1$ is $A_1$ and $x_2$ is $B_2$, then $y = p_2 x_1 + q_2 x_2 + r_2$
Rule 3: if $x_1$ is $A_2$ and $x_2$ is $B_1$, then $y = p_3 x_1 + q_3 x_2 + r_3$
Rule 4: if $x_1$ is $A_2$ and $x_2$ is $B_2$, then $y = p_4 x_1 + q_4 x_2 + r_4$
The structure of each layer of the ANFIS is described as follows:
layer 1 (fuzzification): fuzzifying the input variable, and determining the membership degree of the input variable in each fuzzy set by using a membership function;
μAi(x1)=MF(x1),μBj(x2)=MF(x2),i,j=1,2 (8)
wherein A isi,BjFor the segmented fuzzy set, μ is the membership in the corresponding case, MF (-) is the membership function, and for the gaussian curve, as shown in equation (12), the parameter set { c, σ } constitutes the precondition parameters of ANFIS.
Figure BDA0002059820160000111
Layer 2 (rule inference): each rule neuron receives the inputs from its corresponding fuzzification neurons and calculates the activation weight of its inference rule;
$$\omega_n = \mu_{A_i}(x_1)\,\mu_{B_j}(x_2), \quad n = 1, 2, 3, 4; \quad i, j = 1, 2 \qquad (10)$$
Layer 3 (normalization): each neuron of this layer receives the outputs of all neurons of the previous layer and normalizes the activation weight of each rule;
$$\bar{\omega}_n = \frac{\omega_n}{\sum_{k=1}^{4} \omega_k}, \quad n = 1, 2, 3, 4 \qquad (11)$$
Layer 4 (defuzzification): this layer uses the rule interpretation function $f_n$ to defuzzify the activation weight of each rule, and calculates the weighted consequent value of the given rule. The parameter set $\{p_n, q_n, r_n\}$ of the linear interpretation function constitutes the conclusion parameters of the ANFIS;
$$\bar{\omega}_n f_n = \bar{\omega}_n (p_n x_1 + q_n x_2 + r_n), \quad n = 1, 2, 3, 4 \qquad (12)$$
Layer 5 (output): this layer sums all the defuzzified neuron outputs to obtain the final output $y$ of the ANFIS.
$$y = \sum_{n=1}^{4} \bar{\omega}_n f_n \qquad (13)$$
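The five layers above can be illustrated with a short Python sketch. It is only a minimal illustration under stated assumptions, not the patent's implementation: the Gaussian membership function is assumed to take the common form exp(-(x - c)^2 / (2 sigma^2)), and the names `gaussian_mf`, `anfis_forward`, `premise` and `consequent`, as well as all numeric values, are hypothetical.

```python
import math


def gaussian_mf(x, c, sigma):
    """Membership degree of x in a fuzzy set with centre c and width sigma
    (assumed Gaussian form; the source only names the parameters {c, sigma})."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x1, x2, premise, consequent):
    """Forward pass of a two-input, four-rule first-order Sugeno-type ANFIS.

    premise:    {"A": [(c, sigma), (c, sigma)], "B": [(c, sigma), (c, sigma)]}
    consequent: four (p, q, r) tuples in rule order (A1,B1), (A1,B2), (A2,B1), (A2,B2)
    """
    # Layer 1: membership degree of each input in each of its two fuzzy sets
    mu_a = [gaussian_mf(x1, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(x2, c, s) for c, s in premise["B"]]
    # Layer 2: activation weight of each rule (product of the two memberships)
    w = [ma * mb for ma in mu_a for mb in mu_b]
    # Layer 3: normalise the activation weights so that they sum to 1
    total = sum(w)
    w_bar = [wn / total for wn in w]
    # Layer 4: weighted linear consequent of each rule
    weighted = [wb * (p * x1 + q * x2 + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5: the output is the sum of the weighted consequents
    return sum(weighted)


# Hypothetical parameters, chosen only to make the arithmetic easy to follow
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
y = anfis_forward(0.5, 0.5, premise, consequent)
```

At the point (0.5, 0.5) all four rules fire equally in this example, so the output is simply the average of the four rule consequents.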
(3.2) Optimizing the ANFIS training parameters: after the initial ANFIS structure and parameters are determined, the precondition parameters, which determine the shape of the membership functions, and the consequent parameters of the rule interpretation functions are optimized. A Particle Swarm Optimization (PSO) algorithm performs a heuristic search over the network training parameters so that the normalized root mean square error (nRMSE) of the sample data reaches its minimum. The objective function is:
$$\mathrm{nRMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{P_{oi} - P_{ri}}{G_i}\right)^2} \qquad (14)$$
where $P_{oi}$ is the output of the ANFIS network for the $i$-th input sample, $P_{ri}$ is the measured value of the corresponding sample, $G_i$ is the boot capacity (capacity in operation), and $N$ is the number of samples.
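As a rough illustration of this objective, the sketch below computes a capacity-normalized root mean square error in Python. The exact form of the patent's formula is not reproduced in the text, so the per-sample normalization by the capacity is an assumption, and the function name `nrmse` and the sample values are hypothetical.

```python
import math


def nrmse(predicted, measured, capacity):
    """Root mean square of the capacity-normalised errors (P_o - P_r) / G.

    Assumption: every sample error is divided by its own boot capacity G_i
    before squaring and averaging over the N samples.
    """
    if not predicted or not (len(predicted) == len(measured) == len(capacity)):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(((po - pr) / g) ** 2
                for po, pr, g in zip(predicted, measured, capacity))
    return math.sqrt(total / len(predicted))


# Two hypothetical samples: normalised errors are -0.1 and 0.2
error = nrmse([10.0, 20.0], [12.0, 16.0], [20.0, 20.0])
```

A value returned by such a function could serve as the fitness that the particle swarm search minimizes.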
Optimizing the ANFIS parameters by Particle Swarm Optimization (PSO) specifically comprises the following sub-steps:
Step 1: PSO algorithm initialization: set the algorithm parameters, including the population size $N_s$, the total number of iterations $T_{iter}$, the maximum velocity $v_{max}$, and the learning factors $c_1, c_2$; initialize the population positions $x_i = (x_{i1}, x_{i2}, \dots, x_{i\,dim})$, $i = 1, 2, \dots, N_s$, where $dim$ is the number of ANFIS parameters and the position vector $x_i$ of each particle represents one set of ANFIS parameters; initialize the velocity vectors $v_i = (v_{i1}, v_{i2}, \dots, v_{i\,dim})$, $v_{id} \in [-v_{max}, v_{max}]$; set the initial iteration count $t = 0$ and the initial inertia factor $w_0 = 1$;
Step 2: calculate the fitness value of each particle, $Fit[i] = f(x_i)$, $i = 1, 2, \dots, N_s$;
Step 3: select the global optimal solution from the population according to the fitness values, denoted $Gbest = (P_{g1}, P_{g2}, \dots, P_{g\,dim})$;
Step 4: retain the best position that each particle $i$ has experienced, denoted $Pbest_i = (P_{i1}, P_{i2}, \dots, P_{i\,dim})$;
Step 5: update the velocity and position of the next generation of particles as follows:
$$v_{id}(t+1) = v_{id}(t) + c_1 r_1 (P_{id} - x_{id}(t)) + c_2 r_2 (P_{gd} - x_{id}(t)) \qquad (15)$$
$$x_{id}(t+1) = x_{id}(t) + w \cdot v_{id}(t+1) \qquad (16)$$
$$i = 1, 2, \dots, N_s; \quad d = 1, 2, \dots, dim$$
where $r_1, r_2$ are random numbers in (0,1), and $w$ is the inertia factor, which decreases linearly from 1 to 0 with the iteration count, $w = 1 - t/T_{iter}$;
Step 6: set $t = t + 1$. If $t > T_{iter}$, the algorithm ends and outputs the population global optimal solution Gbest as the final solution, which is used as the optimal ANFIS parameters; otherwise, go to Step 2.
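Steps 1 to 6 can be sketched in Python as follows. This is a hedged illustration rather than the patent's code: it follows the update rules as written in the text, with the factor w applied in the position update and taken as w = 1 - t / T_iter, while the velocity clamping after each update, the search bounds, the default parameter values and the test function `sphere` are assumptions introduced here.

```python
import random


def pso_minimize(f, dim, n_particles=20, t_iter=100, v_max=0.5,
                 c1=2.0, c2=2.0, bounds=(-1.0, 1.0), seed=0):
    """Minimise f over a dim-dimensional box with the particle swarm steps above."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Step 1: initialise positions, velocities and the iteration counter
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]
    pbest_fit = [float("inf")] * n_particles
    gbest, gbest_fit = x[0][:], float("inf")
    t = 0
    while t <= t_iter:
        # Steps 2-4: evaluate fitness, update the global and personal bests
        for i in range(n_particles):
            fit = f(x[i])
            if fit < pbest_fit[i]:
                pbest_fit[i], pbest[i] = fit, x[i][:]
            if fit < gbest_fit:
                gbest_fit, gbest = fit, x[i][:]
        # Step 5: velocity and position update; w decreases linearly from 1 to 0
        w = 1.0 - t / t_iter
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] += (c1 * r1 * (pbest[i][d] - x[i][d])
                            + c2 * r2 * (gbest[d] - x[i][d]))
                # Assumption: keep velocities inside [-v_max, v_max] after each update
                v[i][d] = max(-v_max, min(v_max, v[i][d]))
                x[i][d] += w * v[i][d]
        # Step 6: next iteration; the loop ends once t exceeds t_iter
        t += 1
    return gbest, gbest_fit


def sphere(p):
    """Hypothetical test objective standing in for the ANFIS training error."""
    return sum(c * c for c in p)


best, best_fit = pso_minimize(sphere, dim=3)
```

In the method described here, the position vector would hold the ANFIS precondition and consequent parameters, and `f` would return the training error of the ANFIS built from them.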
Step (4): performing prediction: at the current time $t$, for the prediction tasks with different time steps $\tau$, an input vector containing the candidate features is prepared according to step (1), and the input vector is then constructed from the optimal input features determined in step (2). Once the input vector is determined, it is fed into the ANFIS trained in step (3), and the output is taken as the predicted value under the current conditions.
For the different prediction time steps, namely $\tau = 1, 2, \dots, 16$, the mRMR-based dynamic input variable selection algorithm of step (2) is applied to the candidate attribute features of step (1). The features are ranked using formula (5) in step (2.3), the information content score $InSc_m$ of each feature is obtained using formula (6) in step (2.4), and the final candidate variables are determined by observing the trend of the cumulative score curve $CumInSc_m$. The cumulative score curves for some of the prediction steps are shown in Fig. 2. The selected feature variables and their order for each prediction step are detailed in Table 1. The monthly normalized root mean square error (nRMSE) of the model predictions for the different prediction steps is shown in Table 2. For intuition, Fig. 3 compares the predicted curve with the actual curve over part of the time period. The experimental results show that the model predicts satisfactorily at the different time steps; in particular, the monthly nRMSE at the end of the fourth hour (step #16) is below 15%, and the prediction accuracy meets practical engineering needs.
TABLE 1 Attribute variables chosen for different prediction step sizes and their ordering
Figure BDA0002059820160000131
Figure BDA0002059820160000141
TABLE 2 Monthly normalized root mean square error (nRMSE) for different prediction step lengths
Figure BDA0002059820160000142
It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (3)

1. A multi-time-step wind power prediction method based on dynamic feature selection, characterized by comprising the following steps:
Step (1): for predictions at different time steps, constructing an original input variable set from historical power record data and public numerical weather prediction (NWP) information, wherein the original input variable set comprises features of the power time series and meteorological elements including air temperature, air pressure, humidity, wind speed and wind direction; at the current time $t$, the input variable feature set for prediction time step $\tau$ is as follows:
$$\{p(t-T),\ p(t-T+1),\ \dots,\ p(t),\ wdv(t+\tau),\ trd(t+\tau),\ temp(t+\tau),\ hum(t+\tau),\ aip(t+\tau)\} \qquad (1)$$
wherein $p(t-T), p(t-T+1), \dots, p(t)$ are the interval-averaged power values of the historical time series, and $T$ is the length of the time series; $wdv(t+\tau)$, $trd(t+\tau)$, $temp(t+\tau)$, $hum(t+\tau)$, $aip(t+\tau)$ are the NWP values for the prediction period, namely wind speed, wind direction, air temperature, humidity and air pressure, in that order;
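The indexing of the candidate feature set can be made concrete with a small Python sketch. The data layout is an assumption made for illustration: `power_history` is a hypothetical list of interval-averaged power values indexed by time step, and `nwp_forecast` is a hypothetical mapping from a time step to its NWP values.

```python
def build_input_vector(power_history, nwp_forecast, t, tau, T):
    """Assemble the candidate features for predicting step t + tau at time t.

    Returns p(t-T), ..., p(t) followed by wdv, trd, temp, hum, aip at t + tau.
    """
    if t - T < 0:
        raise ValueError("not enough history for the requested series length T")
    lagged = list(power_history[t - T:t + 1])  # T + 1 historical power values
    nwp = nwp_forecast[t + tau]                # forecast for the target period
    return lagged + [nwp[key] for key in ("wdv", "trd", "temp", "hum", "aip")]


# Hypothetical data: ten power values and one NWP record for time step 7
power_history = list(range(10))
nwp_forecast = {7: {"wdv": 8.0, "trd": 180.0, "temp": 15.0, "hum": 0.6, "aip": 1013.0}}
features = build_input_vector(power_history, nwp_forecast, t=5, tau=2, T=3)
```

With these inputs the vector holds the four power values for steps 2 to 5, followed by the five weather values forecast for step 7.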
Step (2): ranking the attribute variables and calculating their cumulative information contribution rate, thereby determining the attribute variables used as model input, which specifically comprises the following sub-steps:
(2.1) calculating the mutual information between feature variables: mutual information measures both linear and nonlinear correlations between variables, and its definition is based on information entropy; the mutual information $I(X; Y)$ between any two discrete random variables $X$ and $Y$ is defined in formula (2), and represents the amount of information that variable $X$ provides in reducing the uncertainty of variable $Y$:
$$I(X; Y) = \sum_{x \in X}\sum_{y \in Y} p(x, y) \log\frac{p(x, y)}{p(x)\,p(y)} \qquad (2)$$
wherein $p(x, y)$ is the joint probability distribution function of the random variables $X$ and $Y$, and $p(x)$ and $p(y)$ are the marginal probability distribution functions of $X$ and $Y$, respectively;
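A minimal Python sketch of this definition for discrete samples is given below. It estimates the probabilities by empirical frequencies and uses the natural logarithm; both choices are assumptions, since the text specifies neither the estimator nor the logarithm base, and continuous quantities such as power or wind speed would first have to be discretized. The function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two equally long discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Identical sequences share all their information; independent ones share none
mi_same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
mi_indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

For the identical binary sequences the estimate equals ln 2, and for the independent pair it is zero.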
(2.2) calculating the feature information content score: the minimum-redundancy maximum-relevance (mRMR) criterion identifies a subset of informative input variables by considering maximum relevance and minimum redundancy together; under the mRMR criterion, the information content score of a single input variable $v_i$ is defined in formula (3):
$$J(v_i, S) = I(v_i; t) - \frac{1}{|S|}\sum_{s_j \in S} I(v_i; s_j), \quad v_i \in V - S \qquad (3)$$
wherein $V$ is the set of all candidate attribute variables, $S$ is the set of already selected input variables, and $|\cdot|$ is the number of variables contained in a set; the first term on the right-hand side uses the mutual information $I(v_i; t)$ between the target $t$ and the candidate input variable $v_i$ to measure how strongly the candidate variable is related to the prediction target, and the second term penalizes redundancy, so that the variables least similar to those already selected are preferred;
(2.3) dynamically selecting and ranking the feature variables: the input variables are selected by an incremental forward stepwise search, and the candidate feature variables are selected and ranked by the information content score $J(v_i, S)$ of formula (3) in step (2.2); in the first step, the maximum-relevance component of every candidate attribute variable, namely its mutual information with the prediction target, is calculated, and the variable with the largest mutual information is taken as the first input variable:
$$s_1 = \arg\max_{v_i \in V} I(v_i; t) \qquad (4)$$
the remaining variables are ranked by comparing their information content scores from formula (3): at step $m$, the remaining variables are $V - S_{m-1}$, wherein $2 \le m \le |V|$ and $S_{m-1} = \{s_1, s_2, \dots, s_{m-1}\}$ is the ordered set of the variables ranked in the first $m-1$ steps, so that the variable selected at step $m$ is:
$$s_m = \arg\max_{v_i \in V - S_{m-1}} J(v_i, S_{m-1}) \qquad (5)$$
the input variable selection algorithm searches one step forward each time, so that the process progressively evaluates all candidate variables until, at step $|V|$, all attribute variables in $V$ have been evaluated and ranked according to the mRMR criterion;
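The forward ranking can be sketched in Python as below. The sketch assumes the common difference form of the mRMR score (relevance minus mean redundancy with the already selected variables), takes the mutual information values as precomputed inputs, and scores the first variable by its relevance alone; the names `mrmr_rank`, `relevance` and `redundancy` and all numbers are hypothetical.

```python
def mrmr_rank(relevance, redundancy):
    """Rank candidate variables by an incremental forward mRMR search.

    relevance:  {name: I(v; target)}
    redundancy: {frozenset({a, b}): I(a; b)} for every pair of candidates
    Returns a list of (name, score) pairs in selection order.
    """
    remaining = set(relevance)
    # First step: pick the variable with the largest relevance to the target
    first = max(sorted(remaining), key=lambda name: relevance[name])
    ranked = [(first, relevance[first])]
    remaining.remove(first)
    while remaining:
        selected = [name for name, _ in ranked]

        def score(name):
            # Relevance minus mean mutual information with the selected variables
            mean_red = sum(redundancy[frozenset((name, s))]
                           for s in selected) / len(selected)
            return relevance[name] - mean_red

        best = max(sorted(remaining), key=score)
        ranked.append((best, score(best)))
        remaining.remove(best)
    return ranked


# Hypothetical mutual information values for three candidates
relevance = {"a": 0.9, "b": 0.8, "c": 0.5}
redundancy = {frozenset(("a", "b")): 0.7,
              frozenset(("a", "c")): 0.1,
              frozenset(("b", "c")): 0.2}
ranking = mrmr_rank(relevance, redundancy)
```

Here candidate `b` is more relevant than `c`, yet `c` is selected second because `b` is largely redundant with the first choice `a`.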
(2.4) determining the final selected feature variables: from the ranked features obtained in step (2.3), calculating the information content score $InSc_m$ and the cumulative information contribution rate $CumInSc_m$ of each variable, as shown in formulas (6) and (7):
$$InSc_m = J(s_m, S_{m-1}), \quad m = 2, 3, \dots, |V| \qquad (6)$$
Figure FDA0002059820150000024
when $CumInSc_m$ reaches its maximum, the corresponding ordered variable set $S_{max}$ is taken as the final input variables; it represents the feature attributes with minimum redundancy and maximum relevance, from which the subsequent machine learning algorithm mines the regression pattern;
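The cut-off rule can be illustrated with the following Python sketch. Because formula (7) is not reproduced in the text, the sketch assumes that the cumulative score is a plain running sum of the per-variable scores in ranked order; the function name `select_by_cumulative_score` and the scores are hypothetical.

```python
from itertools import accumulate


def select_by_cumulative_score(ranked):
    """Keep the ranked variables up to the maximum of the cumulative score curve.

    ranked: list of (name, score) pairs in selection order.
    Assumption: the cumulative score is the running sum of the scores.
    """
    names = [name for name, _ in ranked]
    cumulative = list(accumulate(score for _, score in ranked))
    # Index of the first maximum of the cumulative curve
    m_best = max(range(len(cumulative)), key=lambda m: cumulative[m])
    return names[:m_best + 1], cumulative


# Hypothetical scores: later variables add more redundancy than information
ranked = [("a", 0.9), ("c", 0.4), ("b", -0.2), ("d", -0.1)]
selected, curve = select_by_cumulative_score(ranked)
```

Under this assumption the curve peaks after the second variable, so only `a` and `c` would be passed on as model inputs.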
Step (3): using the ordered variable set $S_{max}$ of optimal feature variables obtained in step (2) as the training set, optimizing and adjusting the network parameters of the Adaptive Neuro-Fuzzy Inference System (ANFIS), which specifically comprises the following sub-steps:
(3.1) determining the ANFIS network structure: setting the membership function shape type and the fuzzy set number of an Adaptive Neural Fuzzy Inference System (ANFIS) so as to determine an ANFIS network structure, wherein once the network structure is determined, training parameters are also determined, including precondition parameters and conclusion parameters;
(3.2) optimizing the ANFIS training parameters: after the initial ANFIS structure and parameters are determined, optimizing the precondition parameters, which determine the shape of the membership functions, and the consequent parameters of the rule interpretation functions; a Particle Swarm Optimization (PSO) algorithm performs a heuristic search over the network training parameters so that the normalized root mean square error (nRMSE) of the sample data reaches its minimum, wherein the objective function is as follows:
$$\mathrm{nRMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{P_{oi} - P_{ri}}{G_i}\right)^2} \qquad (8)$$
wherein $P_{oi}$ is the output of the ANFIS network for the $i$-th input sample, $P_{ri}$ is the measured value of the corresponding sample, $G_i$ is the boot capacity (capacity in operation), and $N$ is the number of samples;
Step (4): performing prediction: at the current time $t$, for the prediction tasks with different time steps $\tau$, preparing an input vector containing the candidate features according to step (1), and constructing the input vector from the optimal input features determined in step (2); once the input vector is determined, feeding it into the ANFIS trained in step (3), and taking the output as the predicted value under the current conditions.
2. The method of claim 1, wherein the adaptive neuro-fuzzy inference system (ANFIS) in step (3) is structured as follows:
for a two-input, single-output structure with inputs $x_1, x_2$ and output $y$, the ANFIS contains the following 5-layer structure:
Layer 1: the input variables are fuzzified, and the membership function determines the degree of membership of each input variable in each fuzzy set;
$$\mu_{A_i}(x_1) = \mathrm{MF}(x_1), \quad \mu_{B_j}(x_2) = \mathrm{MF}(x_2), \quad i, j = 1, 2 \qquad (9)$$
wherein $A_i, B_j$ are the partitioned fuzzy sets, $\mu$ is the corresponding degree of membership, and $\mathrm{MF}(\cdot)$ is the membership function; for a Gaussian curve, as shown in formula (10), the parameter set $\{c, \sigma\}$ constitutes the precondition parameters of the ANFIS;
$$\mathrm{MF}(x) = \exp\left(-\frac{(x - c)^2}{2\sigma^2}\right) \qquad (10)$$
Layer 2: each rule neuron receives the inputs from its corresponding fuzzification neurons and calculates the activation weight of its inference rule;
$$\omega_n = \mu_{A_i}(x_1)\,\mu_{B_j}(x_2), \quad n = 1, 2, 3, 4; \quad i, j = 1, 2 \qquad (11)$$
Layer 3: each neuron of this layer receives the outputs of all neurons of the previous layer and normalizes the activation weight of each rule;
$$\bar{\omega}_n = \frac{\omega_n}{\sum_{k=1}^{4} \omega_k}, \quad n = 1, 2, 3, 4 \qquad (12)$$
Layer 4: this layer uses the rule interpretation function $f_n$ to defuzzify the activation weight of each rule, and calculates the weighted consequent value of the given rule; the parameter set $\{p_n, q_n, r_n\}$ of the linear interpretation function constitutes the conclusion parameters of the ANFIS;
$$\bar{\omega}_n f_n = \bar{\omega}_n (p_n x_1 + q_n x_2 + r_n), \quad n = 1, 2, 3, 4 \qquad (13)$$
Layer 5: this layer sums all the defuzzified neuron outputs to obtain the final output $y$ of the ANFIS;
$$y = \sum_{n=1}^{4} \bar{\omega}_n f_n \qquad (14)$$
3. The method according to claim 1, wherein optimizing the ANFIS solution by Particle Swarm Optimization (PSO) in step (3.2) specifically comprises the following sub-steps:
Step 1: PSO algorithm initialization: setting the algorithm parameters, including the population size $N_s$, the total number of iterations $T_{iter}$, the maximum velocity $v_{max}$, and the learning factors $c_1, c_2$; initializing the population positions $x_i = (x_{i1}, x_{i2}, \dots, x_{i\,dim})$, $i = 1, 2, \dots, N_s$, wherein $dim$ is the number of ANFIS parameters and the position vector $x_i$ of each particle represents one set of ANFIS parameters; initializing the velocity vectors $v_i = (v_{i1}, v_{i2}, \dots, v_{i\,dim})$, $v_{id} \in [-v_{max}, v_{max}]$; setting the initial iteration count $t = 0$ and the initial inertia factor $w_0 = 1$;
Step 2: calculating the fitness value of each particle, $Fit[i] = f(x_i)$, $i = 1, 2, \dots, N_s$;
Step 3: selecting the global optimal solution from the population according to the fitness values, denoted $Gbest = (P_{g1}, P_{g2}, \dots, P_{g\,dim})$;
Step 4: retaining the best position that each particle $i$ has experienced, denoted $Pbest_i = (P_{i1}, P_{i2}, \dots, P_{i\,dim})$;
Step 5: updating the velocity and position of the next generation of particles as follows:
$$v_{id}(t+1) = v_{id}(t) + c_1 r_1 (P_{id} - x_{id}(t)) + c_2 r_2 (P_{gd} - x_{id}(t)) \qquad (15)$$
$$x_{id}(t+1) = x_{id}(t) + w \cdot v_{id}(t+1) \qquad (16)$$
$$i = 1, 2, \dots, N_s; \quad d = 1, 2, \dots, dim$$
wherein $r_1, r_2$ are random numbers in (0,1), and $w$ is the inertia factor, which decreases linearly from 1 to 0 with the iteration count, $w = 1 - t/T_{iter}$;
Step 6: setting $t = t + 1$; if $t > T_{iter}$, the algorithm ends and outputs the population global optimal solution Gbest as the final solution, which is used as the optimal ANFIS parameters; otherwise, going to Step 2.
CN201910401140.2A 2019-05-15 2019-05-15 Multi-time step wind power prediction method based on dynamic feature selection Pending CN110674965A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910401140.2A CN110674965A (en) 2019-05-15 2019-05-15 Multi-time step wind power prediction method based on dynamic feature selection

Publications (1)

Publication Number Publication Date
CN110674965A true CN110674965A (en) 2020-01-10

Family

ID=69068745

Country Status (1)

Country Link
CN (1) CN110674965A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200110
