CN116226679A

CN116226679A - Wind turbine generator gearbox abnormality detection method considering similarity of running states of multiple units

Info

Publication number: CN116226679A
Application number: CN202211564618.1A
Authority: CN
Inventors: 曾祥军
Original assignee: China Three Gorges University CTGU
Current assignee: China Three Gorges University CTGU
Priority date: 2022-12-07
Filing date: 2022-12-07
Publication date: 2023-06-06

Abstract

A method for detecting gearbox abnormalities in a wind turbine generator that considers the similarity of the running states of multiple units. The method comprises: evaluating single state variables with a time series similarity evaluation method based on piecewise linearization; evaluating the similarity between the running state of each wind turbine and that of the turbine to be detected with a space-time similarity quantification method that considers the similarity of multiple state variables; constructing a state estimation LSTM model and training several different LSTM models; verifying the accuracy and adaptability of the different LSTM models; selecting several LSTM models that performed well in verification and constructing a combined state estimation model (CSEM) by weighted combination; and estimating a target variable of the turbine to be detected, calculating the residuals between the estimated and true values, and identifying anomalies by comparing the residual effective value and the residual information entropy. The method detects the operating state of the gearbox from the historical operating data of the wind turbine, requires no on-site inspection by operation and maintenance personnel, and reduces the operation and maintenance cost of the wind turbine.

Description

Wind turbine generator gearbox abnormality detection method considering similarity of running states of multiple units

Technical Field

The invention relates to the technical field of abnormal operation detection of a gearbox of a wind turbine, in particular to a method for detecting abnormal operation of a gearbox of a wind turbine by considering similarity of operation states of multiple wind turbines.

Background

A large wind turbine generator system (hereinafter referred to as a "wind turbine") is a complex electromechanical hybrid system composed of multiple subsystems, which converts wind energy into electric energy through the mutual coupling and coordinated action of those subsystems. Because wind turbines operate for long periods in harsh environments (high temperature, high humidity, saline-alkali conditions, wind-blown sand, and the like) and are subject to random wind speed and direction and to uncertain loads, they are more prone to faults than conventional thermal or hydroelectric generators. The failure rates of the subsystems, and the severity of the consequences when they fail, also differ significantly because of differences in operating mechanism, manufacturing materials, and exposure to external interference. The gearbox is the core of energy transmission in a wind turbine; once it operates abnormally, it can seriously reduce the power generation efficiency of the whole turbine and even cause prolonged shutdown. Taking effective measures to reduce the gearbox failure rate is therefore important for improving the stable operation of the unit and increasing the economic benefit of the wind farm.

Most wind farms currently rely on periodic inspection to check the health of subsystems, including gearboxes. Because the gearbox of a wind turbine is a closed subsystem, on-site operation staff mostly judge its state from personal experience or with simple auxiliary equipment. For example, they judge whether the equipment inside the gearbox is abnormal by listening for obvious abnormal sounds, and they judge the oil pressure and sealing of the gearbox by observing the oil pressure and checking each joint for signs of oil seepage or leakage. This approach usually detects only pronounced symptoms of gearbox anomalies or faults, and it is difficult to find early anomalies and weak faults in this way. If a weak gearbox fault is not found and repaired in time, it easily induces cascading failures and causes huge losses. For example, an early-stage tooth root crack that is not eliminated in time can gradually deteriorate into a serious broken-tooth fault, which can induce cascading faults inside the gearbox and endanger the normal operation of the whole transmission system; if an over-limit temperature of a gearbox rolling bearing is not handled in time, it can cause a serious fire that damages the whole unit. A more reliable and more sensitive gearbox state evaluation method is therefore needed to find abnormal gearbox states in time, so that effective measures can be taken to avoid faults.

With the spread of sensors and the rise of data mining technologies, data-driven fault diagnosis methods for wind turbine gearboxes have gradually attracted attention. Such methods take the data collected by the Supervisory Control And Data Acquisition (SCADA) system or the Condition Monitoring System (CMS) of the wind turbine, use locally or remotely deployed computers as analysis tools, and evaluate the operating state of the gearbox by mining useful information from the data with machine learning or other intelligent algorithms. With such methods, an abnormal operating state of the gearbox can be found before the condition monitoring system of the wind turbine triggers a fault alarm, which gives operation and maintenance staff more time to eliminate the fault in its incipient stage. Compared with the gearbox anomaly detection methods in common use today, data-driven methods offer high accuracy, good flexibility, and a low technical threshold, and have the potential for wide adoption.

The accuracy of data-driven wind turbine gearbox anomaly detection methods is normally positively correlated with the amount of historical data used for model training. However, because wind turbines operate in harsh environments for long periods, the amount of historical data that can be obtained is limited by sensor faults, signal transmission abnormalities, and periodic overwriting of storage, and it is difficult to guarantee enough data for model training. Moreover, many anomaly detection methods assume by default that the historical data used for model training were recorded during normal operation of the wind turbine, whereas an actual wind turbine may already have been operating abnormally at an earlier time, so the recorded historical data may themselves be abnormal. When the historical data for model training are insufficient or abnormal, the reliability of the anomaly detection result is inevitably affected. It is therefore important to solve the problem of insufficient accuracy and reliability of gearbox anomaly detection results when historical data are insufficient or the training data are abnormal.

Disclosure of Invention

In order to solve the technical problems, the invention provides a wind turbine gearbox abnormality detection method considering the similarity of the running states of multiple units. Furthermore, a basis is provided for operation and maintenance personnel to make an overhaul and maintenance plan, and the operation and maintenance cost of the wind turbine generator can be obviously reduced.

The technical scheme adopted by the invention is as follows:

A wind turbine gearbox anomaly detection method considering the similarity of the running states of multiple wind turbines comprises the following steps:

Step 1: evaluating single state variables with a time series similarity evaluation method based on piecewise linearization;

Step 2: using a space-time similarity quantification method for wind turbine running states that considers the similarity of multiple state variables, evaluating one by one the similarity between the running state of each wind turbine in the same wind farm and that of the unit to be detected, and selecting the data of several wind turbines whose running states are most similar to that of the unit to be detected;

Step 3: preprocessing the historical data of the unit to be detected and the historical data of the similar units selected in step 2, and dividing the historical data by time period into training data, test data, and detection data;

Step 4: constructing a state estimation LSTM model, and training several different state estimation LSTM models with the training data of the selected similar units and the training data of the unit to be detected;

Step 5: setting a comprehensive evaluation index, and verifying the accuracy and adaptability of the different state estimation LSTM models with the test data;

Step 6: selecting several state estimation LSTM models that performed well in the verification of step 5, and constructing a combined state estimation model (Combination State Estimation Model, CSEM) by weighted combination;

Step 7: estimating a target variable of the unit to be detected with the combined state estimation model of step 6, calculating the residuals between the estimated and true values, and identifying anomalies by comparing the residual effective value and the residual information entropy.

The step 1 comprises the following steps:

S1.1: Let the state variable time series curves of two wind turbines be L1 and L2, each containing L sampling points;

S1.2: A linearization segmentation method is used to divide the long time series into short time series. After linearization segmentation, the state variable time series L1 and L2 can be approximated by polylines of h (h > 2) segments and k (k > 2) segments respectively, with the break points taken at the points where the values on the two curves change abruptly. Each polyline segment has one of three trends (rising, falling, or unchanged), denoted 1, -1, and 0 respectively, so that the state variable time series curves can be approximated by the data sets $S_1$ and $S_2$, represented as:

In formula (1), t denotes the time corresponding to each division point; m denotes the trend of the polyline segment, taking values in {1, -1, 0}; Δy is the difference between the normalized value at the end and the normalized value at the start of each polyline segment, reflecting its amplitude change; the superscripts s1 and s2 distinguish the two polylines, and the subscripts h and k identify the h-th and k-th segments of the respective polylines.

The elements of the sets $S_1$ and $S_2$ describe the h-th linear segment of curve L1 and the k-th linear segment of curve L2, respectively. The sets of times corresponding to their linearization division points are denoted $T^{s1}$ and $T^{s2}$, where $t_h^{s1}$ is the time of the h-th division point of polyline L1 and $t_k^{s2}$ is the time of the k-th division point of polyline L2.

S1.3: On the basis of the first linearization segmentation in S1.2, a second linearization segmentation is performed so that the division points of curves L1 and L2 coincide. The division point set T of the second segmentation is obtained by taking the union of the sets $T^{s1}$ and $T^{s2}$ and arranging the division points in chronological order:

$$T = \mathrm{sort}(T^{s1} \cup T^{s2}) = \{t_0, t_1, t_2, \ldots, t_{l-1}, t_l\}, \quad (l \ge h,\ l \ge k) \tag{2}$$

In formula (2), sort() arranges values in ascending order; the symbol ∪ denotes set union; l is the number of division points of each curve after the second segmentation; h and k are the numbers of division points of curves L1 and L2 after the first linearization segmentation; and $t_l$ is the time corresponding to the l-th division point. After the second linearization segmentation, the linearization segmentation sets of the original time series curves L1 and L2 are expressed as:

In formula (3), $t_0, t_1, \ldots, t_{l-1}, t_l$ are the times corresponding to the division points. For the line segment of curve L1 determined by the two points $t_{l-1}$ and $t_l$ after the second linearization segmentation, the trend is $m_l^{s1}$ and the amplitude change is $\Delta y_l^{s1}$; for the corresponding line segment of curve L2, the trend is $m_l^{s2}$ and the amplitude change is $\Delta y_l^{s2}$.

At this point, the two state variable time series curves are divided into the same number of polyline segments with identical division points.
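The second segmentation of S1.3 can be sketched in Python as follows. This is a minimal illustration, assuming each polyline is supplied as its break-point times and normalized values and that both polylines span the same time range. The function names (`polyline_value`, `resegment`, `align_polylines`) and the tolerance `tol` are hypothetical and are not taken from the method as filed.

```python
from bisect import bisect_right


def polyline_value(bp_times, bp_values, t):
    """Linearly interpolate the polyline (bp_times, bp_values) at time t."""
    # Index of the segment that contains t, clamped to a valid segment.
    j = min(max(bisect_right(bp_times, t) - 1, 0), len(bp_times) - 2)
    t0, t1 = bp_times[j], bp_times[j + 1]
    y0, y1 = bp_values[j], bp_values[j + 1]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)


def resegment(bp_times, bp_values, merged_times, tol=1e-9):
    """Describe a polyline on a shared break-point grid as (trend, amplitude) pairs."""
    ys = [polyline_value(bp_times, bp_values, t) for t in merged_times]
    segments = []
    for y_start, y_end in zip(ys, ys[1:]):
        dy = y_end - y_start
        # Trend coding from S1.2: 1 rising, -1 falling, 0 unchanged.
        trend = 0 if abs(dy) <= tol else (1 if dy > 0 else -1)
        segments.append((trend, dy))
    return segments


def align_polylines(bp1, y1, bp2, y2):
    """Second segmentation: T = sort(T^s1 ∪ T^s2), then re-describe both curves on T."""
    merged = sorted(set(bp1) | set(bp2))
    return merged, resegment(bp1, y1, merged), resegment(bp2, y2, merged)
```

For example, `align_polylines([0, 4, 10], [0.0, 0.8, 0.2], [0, 6, 10], [0.1, 0.1, 0.9])` merges the break points into `[0, 4, 6, 10]`, so that both curves are described by three segments with identical division points.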

S1.4: The similarity $S_{ST}$ of the state variables is calculated according to formula (4):

In formula (4), $S_{ST}$ takes values in [0, 1]; the larger its value, the more similar the two state variables. l is the total number of linear segments and i is the segment index; $m_i^{s1}$ and $m_i^{s2}$ are the trends of the i-th line segment of curves L1 and L2 after linearization segmentation; $\Delta y_i^{s1}$ and $\Delta y_i^{s2}$ are the corresponding amplitude changes. The remaining two terms of formula (4) are the trend distance and the amplitude distance of corresponding line segments after linearization segmentation.
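The idea of combining a trend distance and an amplitude distance into a score in [0, 1] can be illustrated with the sketch below. It is not formula (4): the per-segment distances, the equal weighting `trend_weight=0.5`, and the function name `similarity_score` are all assumptions chosen only so that identical polylines score 1 and fully opposite ones score 0. Amplitudes are assumed to come from values normalized to [0, 1].

```python
def similarity_score(seg1, seg2, trend_weight=0.5):
    """Illustrative similarity of two polylines given as (trend, amplitude) pairs.

    Assumed scoring (not the patent's formula (4)):
      trend distance     = |m1 - m2| / 2      -> 0, 0.5 or 1
      amplitude distance = |dy1 - dy2| / 2    -> in [0, 1] for normalized data
      score              = 1 - mean weighted distance over all segments
    """
    if not seg1 or len(seg1) != len(seg2):
        raise ValueError("both polylines need the same, non-zero number of segments")
    total = 0.0
    for (m1, dy1), (m2, dy2) in zip(seg1, seg2):
        trend_dist = abs(m1 - m2) / 2.0
        amp_dist = min(abs(dy1 - dy2) / 2.0, 1.0)
        total += trend_weight * trend_dist + (1.0 - trend_weight) * amp_dist
    return 1.0 - total / len(seg1)
```

Because the segments are compared pairwise on a shared time grid, two series with the same value distribution but shifted in time receive a lower score, which is the time-synchronism property the method relies on.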

The step 2 comprises the following steps:

S2.1: Select several state variables that comprehensively reflect the similarity of the macroscopic and microscopic states of the wind turbine, and calculate the similarity of the state variables according to formula (5):

In formula (5), $S_a$ is the area of the shaded region; $S_0$ is the area of the regular polygon; n is the number of selected state variables; and the remaining symbols denote the similarities of the i-th, the 1st, the (i+1)-th, and the (n+1)-th state variables.

S2.2: Quantify the space-time similarity of the running states of the wind turbines with a radar chart area comparison method:

S2.2.1: First, construct a regular-polygon radar chart whose number of axes equals the number of state variables selected in S2.1; each axis of the radar chart represents one state variable, and the reference length of each axis is set to 1;

S2.2.2: Starting from the center of the regular-polygon radar chart, draw an arrow outwards along each axis; the length of the arrow represents the similarity of the corresponding state variable of the two wind turbines;

S2.2.3: Finally, connect the end points of all arrows in sequence to form a closed figure, namely the shaded part of the radar chart; the ratio of the area of this closed figure to the area of the radar chart represents the space-time similarity of the running states of the two wind turbines.
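The area ratio described in S2.2 follows from elementary geometry: two adjacent arrows of lengths a and b separated by the angle 2π/n span a triangle of area (1/2)·a·b·sin(2π/n), and the regular polygon with unit axes has area (n/2)·sin(2π/n). A minimal Python sketch of that ratio is given below; the function name `radar_similarity` is hypothetical, and the wrap-around from the last axis back to the first is an assumption consistent with the closed figure of S2.2.3.

```python
import math


def radar_similarity(scores):
    """Ratio of the radar-chart shaded area to the full regular-polygon area.

    scores: per-variable similarities in [0, 1], one per radar axis, in axis order.
    """
    n = len(scores)
    if n < 3:
        raise ValueError("a radar chart needs at least three axes")
    angle = 2.0 * math.pi / n
    # Sum of triangles between neighbouring arrows; the last axis wraps to the first.
    shaded = sum(0.5 * scores[i] * scores[(i + 1) % n] * math.sin(angle) for i in range(n))
    full = 0.5 * n * math.sin(angle)
    return shaded / full
```

Note that the result depends on the order in which the variables are placed on the axes, because only neighbouring axes contribute a triangle; `radar_similarity([1, 0, 1, 0])` is 0 even though two variables match perfectly.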

The step 4 comprises the following steps:

S4.1: An LSTM is used to construct the state estimation model of the target variable:

The LSTM contains a forget gate $f_t$, an input gate $i_t$, and an output gate $O_t$. The forget gate $f_t$ determines which hidden-state information passed on from the previous unit should be forgotten; the input gate $i_t$ decides which information the current unit needs to store; the output gate $O_t$ decides which information of the current hidden state is output to the next unit. The state update process is expressed as:

In formula (6), $W_f$ and $b_f$ are the weight and bias term of the forget gate; $W_i$ and $b_i$ are the weight and bias term of the input gate; $W_o$ and $b_o$ are the weight and bias term of the output gate; $W_c$ and $b_c$ are the weight and bias term of the current state; g(·) denotes the gate function; ⊙ denotes element-wise multiplication of matrices; $h_{t-1}$ is the hidden layer variable at time t-1, which provides short-term memory; $C_{t-1}$ and $C_t$ are the cell layer variables at times t-1 and t, which provide long-term memory; and $X_t$ is the input variable at time t.
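For reference, the standard LSTM update written with the symbols defined for formula (6) is shown below. It is offered as a sketch on the assumption that the method uses the standard formulation; the candidate state $\tilde{C}_t$ and the use of tanh are part of that standard form and are not named in the symbol list above.

```latex
\begin{aligned}
f_t &= g\left(W_f \cdot [h_{t-1}, X_t] + b_f\right) \\
i_t &= g\left(W_i \cdot [h_{t-1}, X_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_c \cdot [h_{t-1}, X_t] + b_c\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \\
O_t &= g\left(W_o \cdot [h_{t-1}, X_t] + b_o\right) \\
h_t &= O_t \odot \tanh\left(C_t\right)
\end{aligned}
```

Here $[h_{t-1}, X_t]$ denotes the concatenation of the previous hidden state and the current input, and g(·) is typically the sigmoid function.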

S4.2: The state estimation model of the wind turbine is built by stacking two LSTM layers. After several state estimation LSTM models have been trained with SCADA data from the unit to be detected and from other units whose running states are similar to it, the state estimation LSTM models with relatively good performance are used to construct the combined state estimation model CSEM.

In step 5, the performance test of the different state estimation LSTM models with the test data is carried out in two steps: first, the accuracy of each state estimation LSTM model's estimates is tested with the test data of its own wind turbine; second, the adaptability of each state estimation LSTM model to the data of the unit to be detected is tested with the test data of the unit to be detected.

In step 5, the performance of every state estimation LSTM model is measured by a composite score index β, calculated as follows:

In formula (7), $y_i$ and $\hat{y}_i$ are the i-th true value and estimated value of the target variable in the verification data set; N is the total number of samples in the verification data set; $\bar{y}$ is the mean of the target variable in the verification data set; $E_{rmse}$ and $E_{mae}$ are the root mean square and the mean absolute value of the residuals, and the smaller they are, the more accurate the model's estimates; $R^2$ is the coefficient of determination, which reflects the goodness of fit of the model to the data and takes values in [0, 1], a larger value meaning the model explains the data better. According to formula (7), the composite score index β is closely related to $E_{rmse}$, $E_{mae}$, and $R^2$; the larger β is, the better the overall performance of the model.
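The three component metrics can be computed as in the sketch below, which uses their standard definitions. The function name `estimation_metrics` is hypothetical, and the composite index β is deliberately left out because its exact form is given only by formula (7).

```python
import math


def estimation_metrics(y_true, y_pred):
    """Return (E_rmse, E_mae, R2) for paired true and estimated values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    e_rmse = math.sqrt(ss_res / n)
    e_mae = sum(abs(r) for r in residuals) / n
    # Standard coefficient of determination; it drops below 0 when the model
    # is worse than always predicting the mean. Undefined if y_true is constant.
    r2 = 1.0 - ss_res / ss_tot
    return e_rmse, e_mae, r2
```

Calling it once on each turbine's own test data and once on the test data of the unit to be detected corresponds to the two verification steps of step 5.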

In step 6, the combined state estimation model CSEM is constructed by weighted combination, as shown in formula (8):

In formula (8), $M_i$ denotes the i-th LSTM sub-model; m is the number of selected LSTM sub-models; $M_{CSEM}$ denotes the CSEM; and $\gamma_i$ is the similarity value of the corresponding wind turbine calculated from formula (5). The combination weights are tied to the space-time similarity of the running states: the more similar the running state of a wind turbine in the same wind farm is to that of the unit to be detected, the greater the contribution to the CSEM of the model trained on that turbine's data.

$M_0$ denotes the LSTM model trained on the historical data of the unit to be detected, with corresponding weight $\gamma_0 = 1$. When the training data are insufficient or contain a large number of outliers, so that a reliable $M_0$ is difficult to obtain, $\gamma_0$ is set to 0; $M_0$ then does not participate in the construction of the CSEM, and anomaly identification for the unit to be detected relies only on data from wind turbines with similar running states.
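A similarity-weighted combination of sub-model estimates might look like the sketch below. The normalization by the sum of the weights is an assumption, since the exact form of formula (8) is not reproduced here, and the function name `csem_estimate` is hypothetical.

```python
def csem_estimate(sub_estimates, weights):
    """Combine per-model estimates with similarity weights gamma_i.

    sub_estimates: list of equal-length estimate sequences, one per LSTM sub-model,
                   with the model of the unit to be detected (M_0) first.
    weights:       gamma_i for each sub-model; gamma_0 = 1, or 0 to exclude M_0.
    Assumption: the weighted sum is normalized by the sum of the weights.
    """
    if len(sub_estimates) != len(weights) or not sub_estimates:
        raise ValueError("need one weight per sub-model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    length = len(sub_estimates[0])
    return [
        sum(w * est[k] for w, est in zip(weights, sub_estimates)) / total
        for k in range(length)
    ]
```

Setting the first weight to 0 reproduces the case in which $M_0$ is unreliable and the estimate rests entirely on the similar turbines.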

In step 7, two indexes are used to determine whether the residuals are abnormal. The first is the residual effective value $R_m$ of the sample, which measures the overall deviation of the sample residuals; the larger $R_m$ is, the more likely the detected sample is abnormal. The second is the residual information entropy $E_n$, which evaluates the uncertainty of the occurrence of anomalies in the sample; the smaller $E_n$ is, the more likely the detected sample is abnormal.

The step 7 comprises the following steps:

S7.1: By sliding-window sampling, the long time series of the target variable of the unit to be detected is divided into several short time series on which anomaly detection is performed; the $R_m$ and $E_n$ values of each short time series are calculated according to formula (9):

In formula (9), I is the number of statistical intervals into which the residual distribution is divided; $d_i$ is the number of residual samples falling in the i-th statistical interval; $p_i$ is the proportion of the detection samples that fall in the corresponding statistical interval; α is the base of the logarithm, which determines the unit of measure of the information entropy and is usually taken as the natural base e; $T_{inv}$ is the sampling interval of the SCADA system; and $T_r$ is the width of the sliding window, which reflects the time scale of detection.
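The two indexes can be sketched in Python with their usual definitions: the root mean square of the residuals in a window and the Shannon entropy of the interval proportions $p_i$. The function names, the `step` parameter, and the assumption that a window holds $T_r / T_{inv}$ samples are illustrative and are not taken from formula (9) itself.

```python
import math


def residual_rms(residuals):
    """Effective (root-mean-square) value R_m of the residuals in one window."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def residual_entropy(interval_counts, base=math.e):
    """Information entropy E_n from the counts d_i of residuals per interval."""
    total = sum(interval_counts)
    entropy = 0.0
    for d in interval_counts:
        if d > 0:                      # empty intervals contribute nothing
            p = d / total              # p_i = d_i / number of samples in the window
            entropy -= p * math.log(p, base)
    return entropy


def sliding_windows(series, window_width, sampling_interval, step=1):
    """Split a long series into windows of window_width / sampling_interval samples."""
    size = int(round(window_width / sampling_interval))
    return [series[s:s + size] for s in range(0, len(series) - size + 1, step)]
```

When all residuals of a window fall in one interval the entropy is 0, and it reaches its maximum, log of the number of intervals, when the residuals are spread evenly.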

S7.2: Three statistical intervals are set (the normal region, the risk region, and the high-risk region) for calculating the information entropy of each detection period. The three statistical intervals are not contiguous, and their boundary values are determined by four key parameters of the box plot, $Q_{low}$, $Q_1$, $Q_3$, and $Q_{up}$, where $Q_1$ and $Q_3$ are the 0.25 and 0.75 quantiles of the box plot, and $Q_{up}$ and $Q_{low}$ are the upper and lower boundaries of the box plot, whose values are calculated by formula (10).

When a residual value is greater than $Q_1$ and smaller than $Q_3$, the residual falls in the normal region; when it is smaller than $Q_{low}$ or greater than $Q_{up}$, the residual falls in the high-risk region; all other residual values fall in the risk region. Residuals in the high-risk region deviate significantly from the overall normal distribution range.

S7.3: First, the effective value and the information entropy of the residual samples in the detection period are calculated. Second, the effective value of the residuals in the detection period is compared with the threshold $H_{rm}$. If it does not exceed the threshold, the residual distribution is within the normal range. If it exceeds the threshold, the sample residuals are large; whether this is caused by a few extremely large noise values or by a large number of abnormal residuals is then judged from the information entropy of the residuals. If the residual entropy is also smaller than the given threshold $H_{en}$, the large effective value is most probably caused by abnormal operation of the detected object; otherwise it is caused by noise interference.
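The interval assignment of S7.2 and the two-stage decision of S7.3 can be sketched as follows. Two points are assumptions: the box-plot boundaries use the conventional whiskers $Q_1 - 1.5\,\mathrm{IQR}$ and $Q_3 + 1.5\,\mathrm{IQR}$, which may differ from formula (10), and the quartiles are taken from a set of reference residuals. All function names and return labels are hypothetical.

```python
import statistics


def box_plot_bounds(reference_residuals):
    """Return (Q_low, Q1, Q3, Q_up) using the conventional 1.5 * IQR whiskers (assumed)."""
    q1, _, q3 = statistics.quantiles(reference_residuals, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q1, q3, q3 + 1.5 * iqr


def region_counts(residuals, bounds):
    """Count residuals in the normal, risk and high-risk regions (in that order)."""
    q_low, q1, q3, q_up = bounds
    normal = risk = high = 0
    for r in residuals:
        if q1 < r < q3:
            normal += 1
        elif r < q_low or r > q_up:
            high += 1
        else:
            risk += 1
    return [normal, risk, high]


def classify_window(rms, entropy, h_rm, h_en):
    """Two-stage decision of S7.3 using thresholds H_rm and H_en."""
    if rms <= h_rm:
        return "normal"        # effective value within the threshold
    # Large effective value: low entropy points to a real anomaly, high entropy to noise.
    return "abnormal" if entropy < h_en else "noise"
```

The counts returned by `region_counts` are the $d_i$ values from which the residual information entropy of a detection window is computed.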

The wind turbine gearbox anomaly detection method of the invention, which considers the similarity of the running states of multiple units, has the following technical effects:

1) The method makes full use of the SCADA data of other wind turbines in the same wind farm to improve the accuracy of gearbox anomaly detection for the target wind turbine.

2) The invention detects abnormal states of the target unit's gearbox from the historical SCADA data of the wind turbine, without on-site inspection by operation and maintenance personnel. The method can identify anomalies in the gearbox oil temperature and the internal bearing temperature of the wind turbine, so that abnormal states that may exist in the gearbox are found in time. This gives operation and maintenance staff a basis for making overhaul and maintenance plans and can significantly reduce the operation and maintenance cost of the wind turbine. The method is particularly suitable for anomaly inspection of wind turbines in remote areas and offshore.

3) The method makes full use of the data of all wind turbines in the same wind farm that are similar to the target wind turbine, which enlarges the pool of available data and effectively addresses the problem, encountered in practice, that the state estimation model cannot be trained accurately because the historical data of the unit to be detected are insufficient or abnormal.

4) The invention provides a state variable similarity comparison method based on piecewise linearization. By considering the mean trend change and the mean amplitude change of all time series segments after piecewise linearization, it quantifies the difference between the same variable of different wind turbines more accurately and overcomes the drawback of existing state variable similarity evaluation methods, which consider only the similarity of numerical features and ignore time synchronism.

5) The invention provides a method for quantifying the space-time similarity of the running states of wind turbines, which comprehensively considers the similarity of the macroscopic and microscopic states of the wind turbine.

6) To identify abnormal states of the target variable, the invention provides an abnormal data identification method based on comparison of the residual effective value and the residual information entropy of the target variable. It reliably distinguishes data anomalies caused by actual abnormal operation of the gearbox from those caused by incidental factors such as noise, thereby improving the reliability of anomaly detection.

Drawings

The invention is further illustrated by the following examples in conjunction with the accompanying drawings:

FIG. 1 is a flowchart of anomaly detection according to the invention.

FIG. 2 is a time series plot of state variables of a wind turbine.

FIG. 3 is a schematic diagram of the space-time similarity quantification of unit running states based on the radar chart area comparison method.

FIG. 4 (a) is a schematic diagram of the basic structure of an RNN;

FIG. 4 (b) is a schematic diagram of the basic structure of an LSTM.

FIG. 5 is a schematic diagram of the stacked LSTM model structure.

FIG. 6 is a flowchart of LSTM performance testing and CSEM construction.

FIG. 7 (a) is a schematic diagram of the division of the statistical intervals;

FIG. 7 (b) is a graph of the value range of the information entropy;

FIG. 7 (c) is a flowchart of anomaly identification.

FIG. 8 is a statistical graph of the importance coefficients of the main variables with respect to the gearbox front bearing temperature, obtained by recursive feature elimination.

FIG. 9 (a) is a graph of the quantified space-time similarity between the running state of each unit in wind farm WF1 and that of WT0;

FIG. 9 (b) is a radar chart quantifying the similarity between WT0 and some of the wind turbines in the wind farm.

FIG. 10 is a graph comparing the CSEM estimation results of different schemes in WF1.

FIG. 11 (a) is a graph of the anomaly detection results of different schemes in WF1 (Sch 1);

FIG. 11 (b) is a graph of the anomaly detection results of different schemes in WF1 (Sch 2);

FIG. 11 (c) is a graph of the anomaly detection results of different schemes in WF1 (Sch 3).

FIG. 12 (a) is a graph of the quantified space-time similarity between the running state of each unit in WF2 and that of WT0;

FIG. 12 (b) is a radar chart quantifying the similarity between each unit in WF2 and WT0.

FIG. 13 (a) is a graph of the anomaly detection results of different schemes in WF2 (Sch 1);

FIG. 13 (b) is a graph of the anomaly detection results of different schemes in WF2 (Sch 2);

FIG. 13 (c) is a graph of the anomaly detection results of different schemes in WF2 (Sch 3).

FIG. 14 (a) is a graph of performance test results based on the WF0 validation dataset in WF1;

FIG. 14 (b) is a graph of performance test results based on the WF0 validation dataset in WF2.

FIG. 15 (a) is a graph of anomaly detection results based on the WF0 detection dataset in WF1;

FIG. 15 (b) is a graph of anomaly detection results based on the WF0 detection dataset in WF2.

FIG. 16 (a) is a graph of the anomaly detection results of different methods when the training data are abnormal (estimated versus true values);

FIG. 16 (b) is a graph of the anomaly detection results of different methods when the training data are abnormal (identified anomalies).

Detailed Description

A wind turbine gearbox anomaly detection method considering the similarity of the running states of multiple units. First, a time series similarity quantification method based on piecewise linearization is proposed. Next, a method for quantifying the space-time similarity of wind turbine running states is designed by jointly considering the similarity of macroscopic and microscopic variables of different turbines; it quantifies the similarity between the running state of the target turbine and that of the other turbines in the same wind farm. Then, historical data from several units whose running states are strongly similar to that of the target wind turbine are used to train several different long short-term memory (LSTM) network models, the accuracy of each model and its adaptability to the target unit's data are verified with test data, and the LSTM models with good accuracy and adaptability are selected to construct a combined state estimation model (Combination State Estimation Model, CSEM), which is used to reliably estimate the target variable of the unit to be detected. Finally, to verify whether the target variable data are abnormal, an abnormal data identification method combining the residual effective value and the residual information entropy is proposed.

FIG. 1 is a flowchart of the anomaly detection method of the invention. It comprises three modules: (1) variable selection and evaluation of the similarity of wind turbine running states; (2) training and performance evaluation of different LSTM models; and (3) combined model construction and anomaly detection. It can be implemented in 9 steps:

Step one: based on professional knowledge, several state variables are selected for evaluating the space-time similarity of wind turbine running states. On this basis, a feature selection algorithm (recursive feature elimination) is used to select a variable that reflects the running state of the gearbox as the target variable (the output variable of the model) and to select the variables closely related to the target variable as the input variables of the model.

Step two: considering the characteristics of the state variables monitored by the wind turbine SCADA system and the shortcomings of commonly used methods, a time series similarity evaluation method based on piecewise linearization is provided for evaluating a single state variable, specifically as follows:

The similarity evaluation of wind turbine running states is based on the similarity of key state variables, so the similarity of a single state variable must be evaluated objectively and accurately. Because the SCADA system stores all wind turbine state variables as time series, the acquired data have both numerical characteristics and time attributes. However, most existing similarity evaluation methods for a single state variable compare numerical statistical features or distribution features. Although these methods can use numerical characteristics to reflect the spatial similarity of the corresponding state variables, they ignore the time synchronism of the state variables. The invention therefore first provides a state variable similarity evaluation method for wind turbines based on piecewise linearization comparison.

Assume that the state variable time series curves of two wind turbines are L1 and L2, each comprising L sampling points, as shown by the blue and green dotted lines in FIG. 2. To quantify their similarity, a linearized segmentation method is first used to split each long time series into short ones. After this segmentation, L1 and L2 can be approximated by polylines of h (h > 2) segments and k (k > 2) segments, respectively, whose break points are taken from the values on the two curves, as shown by the solid red and yellow lines in FIG. 2. As FIG. 2 shows, each polyline segment has one of three variation trends: rising, falling, or remaining unchanged. If these three trends are denoted by 1, -1 and 0, respectively, the original curves can be approximately represented by the data sets S_1 and S_2:

In formula (1): t represents the time corresponding to each division point; m represents the variation trend of a polyline segment, taking values in {1, -1, 0}; Δy is the difference between the normalized end value and the normalized start value of each polyline segment, reflecting the change in its amplitude; the superscripts s1 and s2 distinguish the two polylines; and the subscripts h and k identify the h-th and k-th polyline segments, respectively. The elements of the sets S_1 and S_2 thus represent, respectively, the h-th linearized polyline segment of curve L1 and the k-th linearized polyline segment of curve L2. The time sets formed by their linearization division points can be written as T^s1 and T^s2, respectively.

Because the first linearized segmentation divides each curve according to its own variation characteristics, the division points of the two linearized data sets S_1 and S_2 are not guaranteed to coincide, which hinders the comparison of the time series. A second linearized segmentation is therefore performed on the basis of the first to ensure that the linearization division points of curves L1 and L2 are identical. The division point set T of the second segmentation is obtained by taking the union of the sets T^s1 and T^s2 and arranging the division points in time order, expressed as:

T = sort(T^s1 ∪ T^s2)    (2)

In formula (2): sort() denotes sorting in ascending numerical order, and the symbol ∪ denotes the union of sets. After the secondary segmentation, the linearized segmentation sets of the original curves L1 and L2 can be expressed as:

At this point, the original time series have been divided into polyline segments that are equal in number and share the same division points. The similarity S_ST of the variable can then be calculated according to formula (4):

In formula (4): the value range of S_ST is [0, 1]; the larger the value, the more similar the two variables. The proposed state variable similarity evaluation method thus decomposes the comparison of the original long time series into segment-by-segment comparisons through piecewise linearization, and evaluates the overall similarity of a wind turbine state variable by averaging the trend changes and amplitude changes over all linearized polyline segments. Evaluating similarity in this way avoids the shortcoming of comparisons based on statistical features or spatial distance, which cannot account for time synchronism, so the resulting similarity comparison of wind turbine state variables is more reliable.
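As an illustrative sketch only, the secondary segmentation described above can be written as follows. It merges the break points of two polylines as in formula (2) and re-describes each polyline on the common division points by its trend m and amplitude change Δy. The helper names, the sample break points, the tolerance eps, and the layout of each triple are assumptions made for illustration; the similarity formula (4) itself is not reproduced here.

```python
from bisect import bisect_right

def interpolate(breaks, values, t):
    """Linearly interpolate a polyline given by (breaks, values) at time t."""
    j = min(max(bisect_right(breaks, t) - 1, 0), len(breaks) - 2)
    t0, t1 = breaks[j], breaks[j + 1]
    return values[j] + (values[j + 1] - values[j]) * (t - t0) / (t1 - t0)

def resegment(breaks, values, common, eps=1e-9):
    """Describe a polyline on the common division points as (t, m, dy) triples."""
    triples = []
    for t_start, t_end in zip(common, common[1:]):
        dy = interpolate(breaks, values, t_end) - interpolate(breaks, values, t_start)
        m = 0 if abs(dy) < eps else (1 if dy > 0 else -1)  # trend: 1, -1 or 0
        triples.append((t_end, m, dy))
    return triples

# First segmentation: each curve has its own break points (hypothetical, normalized values).
t_s1, y_s1 = [0, 4, 10], [0.0, 0.8, 0.2]
t_s2, y_s2 = [0, 6, 10], [0.1, 0.7, 0.7]

# Second segmentation: union of both break-point sets, sorted in time order.
T = sorted(set(t_s1) | set(t_s2))
S1 = resegment(t_s1, y_s1, T)
S2 = resegment(t_s2, y_s2, T)

for seg1, seg2 in zip(S1, S2):
    print(seg1, seg2)
```

After this step both curves are described by the same number of segments with identical division points, so trends and amplitude changes can be compared segment by segment.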

Step three: based on the method of step two, a method for quantifying the space-time similarity of wind turbine running states by comprehensively considering the similarity of several state variables is provided. By evaluating, one by one, the running state similarity between each unit in the same wind farm and the unit to be detected, data from several units whose running states are strongly similar to that of the unit to be detected are selected to expand the data available for anomaly detection, as follows:

To measure the space-time similarity of wind turbine running states accurately, the similarity of several state variables must be considered together; a judgment cannot be based on whether a single variable is similar. The invention therefore further provides a method for quantifying the space-time similarity of wind turbine running states that comprehensively considers macroscopic and microscopic state similarity. Macroscopic state similarity refers to the similarity of the operating environment and energy conversion efficiency of the wind turbines; microscopic state similarity refers to the similarity of the operating conditions of a particular detection object or subsystem, such as the gearbox or the generator. Variables reflecting macroscopic state similarity are mostly selected on the basis of expertise, for example wind speed, main shaft rotational speed and output power, whereas variables reflecting microscopic state similarity can be selected by combining expertise with the results of data correlation analysis, choosing variables that directly or indirectly reflect the running state of the monitored object. The microscopic variables required generally differ from one detection object to another.

Several variables that comprehensively reflect the macroscopic and microscopic state similarity of the wind turbines are selected, and the similarity of each state variable is calculated according to formula (4). On this basis, a method based on radar chart area comparison is designed to quantify the space-time similarity of wind turbine running states; the quantification principle is shown in FIG. 3. First, a regular polygon radar chart is constructed whose number of axes equals the number of selected state variables; each axis of the radar chart represents one state variable, and the reference length of each axis is set to 1. Second, blue arrows are drawn outward along the axes from the center of the polygon, and the length of each arrow represents the similarity of the corresponding state variable of the two wind turbines. Finally, the end points of all blue arrows are connected in sequence to form a closed figure, namely the shaded part of the radar chart in FIG. 3, and the ratio of the area of this closed figure to the area of the radar chart represents the space-time similarity of the running states of the two wind turbines. The corresponding similarity quantization formula is:

In formula (5): S_a is the area of the shaded portion, which is computed from the similarity of each i-th state variable; S_0 is the area of the regular polygon; and n is the number of selected state variables. The method comprehensively considers the similarity of several wind turbine state variables in time and space, and yields a relatively reliable quantified value of the space-time similarity of running states.
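A minimal sketch of the radar chart area ratio is given below. Formula (5) is not reproduced in the text, so the sketch assumes the shaded area is the sum of the triangles spanned by adjacent arrows, each with area 0.5 · s_i · s_{i+1} · sin(2π/n), and that the regular polygon has all axis lengths equal to 1. The function name and the sample similarity values are hypothetical.

```python
import math

def radar_similarity(s):
    """Ratio of the shaded radar-chart area to the regular n-gon area (assumed form)."""
    n = len(s)
    if n < 3:
        raise ValueError("a radar chart needs at least three axes")
    wedge = 0.5 * math.sin(2 * math.pi / n)  # area of one unit triangle between adjacent axes
    shaded = sum(wedge * s[i] * s[(i + 1) % n] for i in range(n))
    full = n * wedge  # regular polygon with reference length 1 on every axis
    return shaded / full

# Eight state variables (three macroscopic, five microscopic), hypothetical similarities.
similarities = [0.92, 0.88, 0.90, 0.85, 0.87, 0.91, 0.83, 0.89]
print(round(radar_similarity(similarities), 3))
```

Under this assumed form the result depends on the order in which the variables are placed on the axes, because only adjacent similarities are multiplied, so the axis order should be kept fixed when comparing units.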

Step four: the historical data of the unit to be detected and of the selected units are preprocessed so that the data of every unit remain synchronized in sampling time with the data of the unit to be detected, and the data are divided into training data, test data and detection data.

Step five: based on the long short-term memory network (Long Short Term Memory, LSTM) algorithm, several different LSTM models are trained using the training data of each selected similar unit and the training data of the unit to be detected itself. If the historical data of the unit to be detected are insufficient, training a model on that unit's own historical data can be omitted.

Step six: comprehensive evaluation indexes are set, and the performance of the different LSTM models is verified with test data in two steps: first, the accuracy of the LSTM model trained on each unit's data is verified with that unit's own test data; second, the adaptability of each model to the data of the unit to be detected is verified with the test data of the unit to be detected.

Step seven: several LSTM models that perform well in both performance tests are selected, and a combined state estimation model (Combination State Estimation Model, CSEM) is constructed by weighted combination.

The technical scheme of steps five to seven is as follows:

1) State estimation LSTM model:

To better mine the short-time dependence of each variable and the conditional dependence among variables, an LSTM is used to construct the state estimation model of the target variable. LSTM is a deep learning algorithm improved from the recurrent neural network (Recurrent Neural Network, RNN). It retains the memory and parameter-sharing characteristics of the RNN and, by adding internal control gate structures, effectively mitigates the gradient vanishing and gradient explosion problems that may occur during RNN training, which makes it better suited to learning features with time attributes.

FIGS. 4 (a) and 4 (b) illustrate the basic structural units of the RNN and the LSTM, where the symbol C represents the unit state, h represents the hidden state, X represents the input information, and the subscripts t and t-1 distinguish the information of the current unit from that of the previous unit. As can be seen from FIGS. 4 (a) and 4 (b), the basic unit of the RNN contains only a single neural layer, and only one state stream is passed from unit to unit. In the LSTM, by contrast, two state streams, C and h, are passed between units, and each unit contains four interrelated neural network layers. The three added layers are essentially three special control gate structures: the forget gate f_t, the input gate i_t and the output gate O_t. The forget gate decides which of the state information passed from the previous unit should be forgotten; the input gate decides which information the current unit should store; and the output gate decides which information of the current hidden state should be output to the next unit. The state update process inside the control gates can be expressed as:

In formula (6): W_f and b_f are the weight and bias terms of the forget gate, W_i and b_i are the weight and bias terms of the input gate, W_o and b_o are the weight and bias terms of the output gate, W_c and b_c are the weight and bias terms of the current state, g(·) represents the gate function, and ⊙ represents element-wise multiplication of matrices.
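For illustration, one update of an LSTM unit can be sketched in plain Python as below. Formula (6) is not reproduced in the text, so the sketch assumes the widely used LSTM formulation, with a sigmoid gate function and a tanh candidate state acting on the concatenation of the previous hidden state and the current input. The parameter layout and function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Return W @ v + b for a list-of-rows matrix W and bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM unit update; params maps 'f', 'i', 'o', 'c' to (W, b) pairs."""
    v = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(*params["f"], v)]      # forget gate
    i_t = [sigmoid(z) for z in affine(*params["i"], v)]      # input gate
    o_t = [sigmoid(z) for z in affine(*params["o"], v)]      # output gate
    c_hat = [math.tanh(z) for z in affine(*params["c"], v)]  # candidate state
    # Element-wise update of the unit state and the hidden state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Hidden size 2, input size 3, all-zero parameters (purely illustrative).
zero_layer = ([[0.0] * 5, [0.0] * 5], [0.0, 0.0])
params = {name: zero_layer for name in ("f", "i", "o", "c")}
h_t, c_t = lstm_step([0.3, -0.1, 0.7], [0.0, 0.0], [1.0, -2.0], params)
print(h_t, c_t)
```

With all-zero parameters every gate equals 0.5 and the candidate state is 0, so the unit state is simply halved, which makes the roles of the forget and output gates easy to check by hand.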

Besides short-time dependence, the state variables of a wind turbine also exhibit pronounced nonlinear characteristics. To learn these nonlinear time series characteristics better, the state estimation model of the target variable is constructed by stacking LSTM layers. Stacking LSTM layers increases the depth of the model and thereby effectively improves its nonlinear representation capability, but it also increases computation and memory consumption. For this reason, the state estimation model of the wind turbine is constructed with two stacked layers to balance learning capability and learning efficiency; the structure of the stacked model is shown in FIG. 5. After several LSTM models have been trained with the SCADA data of the wind turbine to be detected and of other turbines with similar running states, the LSTM models with relatively good performance are used to construct the final CSEM.

2) Evaluation of LSTM Performance and construction of CSEM:

LSTM models that meet both the accuracy and the adaptability requirements can be used to construct the final CSEM; the flow is shown in FIG. 6. The performance test of the different LSTM models is divided into two steps: first, the accuracy of each LSTM model's estimates is tested on the verification data set of its own unit; then, the adaptability of the model to the data of the unit to be detected is tested on the verification data set of the unit to be detected. The performance of all models is measured by a comprehensive score index β, calculated as follows:

In formula (7): y_i and ŷ_i are the i-th true value and estimated value of the target variable in the verification data set, respectively; N is the total number of samples in the verification data set; ȳ is the mean value of the target variable over the verification data set; E_rmse and E_mae are the root mean square value and the mean absolute value of the residuals, respectively, and the smaller they are, the more accurate the model estimate; R² is the coefficient of determination, which reflects the goodness of fit of the model to the data and ranges over [0, 1], a larger value indicating that the model explains the data better. As formula (7) shows, the comprehensive score index β is closely related to these three indexes, and the larger its value, the better the comprehensive performance of the model. The LSTM models with relatively better performance can therefore be selected, according to the score of each LSTM model in the accuracy test and the adaptability test, to construct the final CSEM.
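The three indexes and the comprehensive score can be sketched as follows. The expression for β is not reproduced in the text; the sketch assumes β = R² / (E_rmse + E_mae), which reproduces the β values listed in Table 4 (for example 0.967 / (0.377 + 0.319) ≈ 1.389 for Sch1). The function name and sample values are hypothetical.

```python
import math

def evaluate(y_true, y_est):
    """Return (E_rmse, E_mae, R2, beta) for true and estimated target values."""
    n = len(y_true)
    residuals = [a - b for a, b in zip(y_true, y_est)]
    ss_res = sum(r * r for r in residuals)
    e_rmse = math.sqrt(ss_res / n)
    e_mae = sum(abs(r) for r in residuals) / n
    mean = sum(y_true) / n
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    r2 = 1.0 - ss_res / ss_tot
    beta = r2 / (e_rmse + e_mae)  # assumed form, consistent with Table 4
    return e_rmse, e_mae, r2, beta

# Hypothetical bearing temperatures (true) and model estimates.
y_true = [1.0, 2.0, 3.0, 4.0]
y_est = [1.1, 1.9, 3.2, 3.8]
e_rmse, e_mae, r2, beta = evaluate(y_true, y_est)
print(round(e_rmse, 3), round(e_mae, 3), round(r2, 3), round(beta, 3))
```

Under this assumed form a model scores higher when it fits the data well and has small residuals at the same time, which matches the statement that a larger β indicates better comprehensive performance.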

The CSEM is constructed by weighted combination, as shown in formula (8):

In formula (8): M_i represents the i-th LSTM sub-model, M is the number of selected sub-models, M_CSEM represents the CSEM, and γ_i is the quantified similarity of the corresponding wind turbine calculated by formula (5). The combination weights are thus tied to the space-time similarity of the running states of the wind turbines: the more similar the running state of a unit in the same wind farm is to that of the unit to be detected, the greater the contribution to the CSEM of the model trained on that unit's data.

It should be noted that M_0 represents the LSTM model trained on the historical data of the unit to be detected, with the corresponding γ_0 = 1. When the training data are insufficient or contain a large number of outliers, so that a reliable M_0 is difficult to obtain, γ_0 can be set to 0, which means that M_0 does not participate in the construction of the CSEM and the anomaly identification of the unit to be detected relies only on the data of wind turbines with similar running states.
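A minimal sketch of the weighted combination is given below. Formula (8) is not reproduced in the text, so the sketch assumes that the sub-model estimates are weighted by γ_i and normalized by the sum of the weights. The function name and the sample estimates are hypothetical.

```python
def csem_estimate(estimates, gammas):
    """Combine sub-model estimates with similarity weights (assumed normalized form)."""
    total = sum(gammas)
    if total <= 0:
        raise ValueError("at least one sub-model must have a positive weight")
    length = len(estimates[0])
    return [
        sum(g * series[t] for g, series in zip(gammas, estimates)) / total
        for t in range(length)
    ]

# Index 0 is the model of the unit to be detected; the others come from similar units.
estimates = [
    [60.0, 62.0],  # M_0
    [61.0, 63.0],  # sub-model from a similar unit
    [59.0, 61.0],  # sub-model from another similar unit
]
with_own_model = csem_estimate(estimates, [1.0, 0.9, 0.8])
without_own_model = csem_estimate(estimates, [0.0, 0.9, 0.8])  # gamma_0 = 0
print(with_own_model, without_own_model)
```

Setting the first weight to 0 removes M_0 from the combination without changing the code path, which mirrors the case where the unit to be detected has insufficient or contaminated training data.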

Step eight: the target variable of the unit to be detected is estimated based on the CSEM, and the residual between the estimated value and the true value is calculated.

Step nine: to eliminate the interference of noise and accidental factors and to identify abnormal residuals accurately, an abnormal data detection method based on residual effective value comparison and residual information entropy comparison is further provided, with which abnormal residuals caused by abnormal operation of the equipment can be identified effectively.

Anomalies in the target variable cause large changes in the residual between the true value and the estimated value and change the information entropy of the residual sequence. The invention therefore provides an anomaly detection method combining residual effective value comparison and residual information entropy comparison. The method uses two indexes to judge whether the residuals are abnormal. The first index is the effective value R_m of the sample residuals, which measures the overall deviation of the sample residuals; the larger R_m is, the more likely the detected sample is abnormal. The other index is the information entropy E_n of the residuals, which evaluates the uncertainty of the occurrence of anomalies in the sample; the smaller E_n is, the more likely the detected sample is abnormal. To improve detection precision, the invention divides the long time series of the target variable into several short-period sequences for anomaly detection through sliding window sampling. The R_m and E_n values of each short sequence can be calculated according to formula (9):

In formula (9): I is the number of statistical intervals into which the residual distribution is divided; d_i is the number of residual samples falling within the i-th statistical interval; p_i is the proportion of the detection samples that fall within the corresponding statistical interval; α is the base of the logarithm, which determines the unit of the information entropy and is usually taken as the natural constant e; T_inv is the sampling interval of the SCADA system; and T_r is the window width of the sliding window, which reflects the time scale of detection.
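The two indexes can be sketched as below. Formula (9) is not reproduced in the text, so the sketch assumes the effective value is the root mean square of the residuals in a window of N = T_r / T_inv samples and the information entropy is the Shannon entropy of the interval proportions p_i = d_i / N. The function names and sample values are hypothetical.

```python
import math

def effective_value(residuals):
    """Root mean square of the residuals in one detection window (assumed form of R_m)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def information_entropy(counts, base=math.e):
    """Shannon entropy of the interval counts d_i (assumed form of E_n)."""
    total = sum(counts)
    entropy = 0.0
    for d in counts:
        if d > 0:  # an empty interval contributes nothing
            p = d / total
            entropy -= p * math.log(p, base)
    return entropy

# Hypothetical window: 60 residual samples (for example T_r = 60 min, T_inv = 1 min),
# with counts of samples in the normal, risk and high-risk intervals.
window = [0.2, -0.1, 0.4, -0.3] * 15
counts = [50, 8, 2]
print(round(effective_value(window), 3), round(information_entropy(counts), 3))
```

When nearly all samples of a window fall in a single interval the entropy approaches 0, and when they are spread evenly over the three intervals it reaches its maximum of ln 3.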

The invention sets three statistical intervals for calculating the information entropy of each detection period; the intervals are determined from the box plot of all target variable residuals obtained on the verification data set. FIG. 7 (a) marks the three statistical intervals in different colors: a normal region, a risk region, and a high-risk region. These three statistical intervals are not contiguous, and their boundary values are determined by four key parameters of the box plot, Q_low, Q_1, Q_3 and Q_up, where Q_1 and Q_3 are the 0.25 and 0.75 quantiles of the box plot, and Q_up and Q_low are the upper and lower bounds of the box plot, whose values can be calculated by formula (10). When the residual is greater than Q_1 and smaller than Q_3, it falls in the normal region; when the residual is smaller than Q_low or greater than Q_up, it falls in the high-risk region; all remaining values fall in the risk region. Residuals in the high-risk region clearly deviate from the overall normal distribution range. When most samples fall in the normal and risk regions, the effective value of the sample does not deviate noticeably from the overall level, whereas a high proportion of samples in the high-risk region, or the presence of individual pronounced outliers, can cause the effective value of the sample to deviate significantly from the overall level.
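The interval assignment can be sketched as follows. Formula (10) is not reproduced in the text, so the sketch assumes the conventional box plot bounds Q_low = Q_1 - 1.5 (Q_3 - Q_1) and Q_up = Q_3 + 1.5 (Q_3 - Q_1); the whisker factor 1.5, the function names and the sample data are assumptions.

```python
import statistics

def box_bounds(validation_residuals, whisker=1.5):
    """Return (Q_low, Q_1, Q_3, Q_up) from validation residuals (assumed Tukey bounds)."""
    q1, _, q3 = statistics.quantiles(validation_residuals, n=4)
    iqr = q3 - q1
    return q1 - whisker * iqr, q1, q3, q3 + whisker * iqr

def region(residual, bounds):
    """Assign one residual to the normal, risk or high-risk interval."""
    q_low, q1, q3, q_up = bounds
    if q1 < residual < q3:
        return "normal"
    if residual < q_low or residual > q_up:
        return "high_risk"
    return "risk"

def region_counts(residuals, bounds):
    """Count the residuals of one detection window per interval (the d_i values)."""
    counts = {"normal": 0, "risk": 0, "high_risk": 0}
    for r in residuals:
        counts[region(r, bounds)] += 1
    return counts

bounds = (-2.0, -0.5, 0.5, 2.0)  # hypothetical Q_low, Q_1, Q_3, Q_up
print(region_counts([0.1, -0.2, 0.9, 1.5, 2.6, -3.0], bounds))
```

The counts returned for a window are the d_i values from which the proportions p_i and the information entropy of that window are obtained.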

FIG. 7 (b) shows the relationship between the proportions p_i (i = 1, 2, 3) of the detected sample residuals falling within the three statistical intervals and the corresponding information entropy of the detected sample residuals. According to the relationship shown in FIG. 7 (b), when a large proportion of samples fall in the same interval, the information entropy of the sample residuals is small, which indicates that the corresponding outcome occurs with high probability.

Based on the above reasoning, the anomaly detection flow shown in FIG. 7 (c) is proposed. First, the effective value and the information entropy of the residual samples of the detection period are calculated. Second, it is checked whether the effective value of the residuals of the detection period exceeds the threshold H_rm. If the threshold is not exceeded, the residual distribution is within the normal range. If the threshold is exceeded, the residuals are large, and the information entropy of the residuals is further used to judge whether the large effective value is caused by a few large noise values or by a large number of abnormal residuals: if the information entropy of the residuals is also smaller than the given threshold H_en, the large effective value is, with high probability, caused by abnormal operation of the detection object; otherwise it is caused by noise interference.

The method adopts a simple and effective way of setting H_rm and H_en. First, the residuals of the target variable under normal operation are calculated using the verification data set of the unit to be detected. Then, the residual sequence is divided into different detection segments through sliding window sampling. Finally, R_m and E_n are calculated for each segment, and the R_m and E_n values of all detection segments are used to set H_rm and H_en, respectively.
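The decision flow of FIG. 7 (c) can be sketched as a small function. The labels returned, the function names and the sample thresholds are hypothetical; only the order of the two comparisons follows the description above.

```python
def classify_window(r_m, e_n, h_rm, h_en):
    """Apply the two-stage check to one detection window."""
    if r_m <= h_rm:
        return "normal"    # effective value within the normal range
    if e_n < h_en:
        return "abnormal"  # large residuals concentrated in one interval
    return "noise"         # large effective value attributed to noise interference

def detect(windows, h_rm, h_en):
    """Return the indexes of the windows identified as abnormal."""
    return [
        idx
        for idx, (r_m, e_n) in enumerate(windows)
        if classify_window(r_m, e_n, h_rm, h_en) == "abnormal"
    ]

# Hypothetical (R_m, E_n) pairs for four consecutive detection windows.
windows = [(0.30, 0.60), (0.70, 0.55), (0.95, 0.10), (1.20, 0.05)]
print(detect(windows, h_rm=0.543, h_en=0.296))
```

Checking the effective value first means that the entropy only matters for windows whose residuals are already large, so a low entropy caused by residuals concentrated in the normal region is never flagged.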

Step ten: example verification:

The data in the example come from two onshore wind farms located in the central parts of province A and province B. From each wind farm, the SCADA data of 33 units of the same model, commissioned in the same period, are selected for study. The wind turbine to be detected is labeled unit No. 0, and the other turbines are labeled No. 1 to No. 32 in sequence. The data and fault information of the two wind farms are shown in Table 1; the amount of effective data per unit provided by the two wind farms differs markedly. During the study, the data are strictly screened to keep all data synchronized: whenever the data of one wind turbine at a given moment are deleted, the data of the other wind turbines in the same wind farm at that moment are also deleted. The deleted data comprise data recorded at wind speeds below the cut-in wind speed or above the cut-out wind speed, and data recorded during fault shutdown and maintenance periods of each wind turbine. In addition, the anomaly detection time scale is set to 1 hour.
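The synchronized screening described above can be sketched as follows: a timestamp is kept only if the record of every unit at that timestamp is valid. The data layout, the function name, and the cut-in and cut-out wind speeds of 3 m/s and 25 m/s are hypothetical values chosen for illustration.

```python
def synchronize(units, is_valid):
    """Keep only the timestamps at which every unit has a valid record."""
    common = None
    for records in units.values():
        valid = {t for t, rec in records.items() if is_valid(rec)}
        common = valid if common is None else common & valid
    keep = sorted(common or ())
    return {name: {t: records[t] for t in keep} for name, records in units.items()}

CUT_IN, CUT_OUT = 3.0, 25.0  # hypothetical wind speed limits in m/s

def is_valid(record):
    """A record is valid when the unit is running within its wind speed range."""
    return CUT_IN <= record["wind"] <= CUT_OUT and not record.get("shutdown", False)

units = {
    "WT0": {1: {"wind": 5.0}, 2: {"wind": 2.0}, 3: {"wind": 10.0}, 4: {"wind": 8.0}},
    "WT1": {1: {"wind": 6.0}, 2: {"wind": 7.0}, 3: {"wind": 30.0}, 4: {"wind": 9.0, "shutdown": True}},
}
cleaned = synchronize(units, is_valid)
print({name: sorted(records) for name, records in cleaned.items()})
```

Because a timestamp is dropped for all units as soon as one unit has an invalid record, the remaining series stay aligned in time, which the similarity evaluation and the combined model both rely on.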

TABLE 1 SCADA data information and data set partitioning for wind farms

(1) Detection of abnormal temperature of a front bearing of a gearbox based on WF1 wind farm data:

The total amount of training data for each unit from 2018/11/1 to 2019/4/19 is 246240 records, of which 224273 remain available for similarity evaluation and model training after screening. In this example, 8 state variables are selected to evaluate the space-time similarity of the running states of different wind turbines, comprising three macroscopic variables and five microscopic variables. The macroscopic variables are wind speed, main shaft rotational speed and output power, which reflect the energy conversion efficiency of a wind turbine from wind energy to mechanical energy and from mechanical energy to electrical energy. The five microscopic variables are the gearbox front bearing temperature, which is the detection target variable, and four variables closely related to that temperature. A recursive feature elimination algorithm is used in this example to evaluate the importance coefficient of each variable with respect to the gearbox front bearing temperature. FIG. 8 shows the importance coefficients of the main variables with respect to the gearbox front bearing temperature obtained by recursive feature elimination; according to FIG. 8, the gearbox oil temperature, the gearbox rear bearing temperature, the generator front bearing temperature and the nacelle temperature are selected as the remaining four microscopic variables.

Once the state variables for comparing the space-time similarity of wind turbine running states have been determined, the proposed method can quantify the space-time similarity between the running state of each turbine in wind farm WF1 and that of WT0; the quantification results are shown in FIG. 9 (a). In this example, the five units WT6, WT11, WT23, WT17 and WT32, which rank in the top five by space-time similarity of running state with WT0, are selected as similar units; their quantified similarity values γ are shown in Table 2.

TABLE 2 Performance test of LSTM model based on different crew data training in WF1

FIG. 9 (b) shows the quantified similarity radar chart of WT0 and the five units; the similarity between each variable of the selected units and the corresponding variable of WT0 exceeds 0.8.

Once the units with similar running states have been determined, different LSTM models can be trained using the SCADA data of those units. The input variables of the training model are the preprocessed gearbox oil temperature, gearbox rear bearing temperature, generator front bearing temperature, wind speed and nacelle temperature of each unit; the output variable is the estimated gearbox front bearing temperature of the unit. After model training is completed, the accuracy of each LSTM model and its adaptability to the WT0 data are evaluated using the test data set of each unit and the verification data set of WT0. The results of the two performance evaluations for each model are shown in Table 2. When the LSTM models trained on the data of different units are tested for accuracy with their own verification data, every model shows good accuracy. However, the comprehensive score β of all models decreases significantly when the adaptability test is performed with the verification data of WT0, which indicates that the models differ in their adaptability to the SCADA data of WT0. In addition, Table 2 shows that the LSTM models corresponding to units with strong similarity to WT0 also have relatively good adaptability.

Further, according to the results in Table 2, the LSTM models that perform well in both performance tests are selected for constructing the combined state estimation model (CSEM); these sub-models are trained on the SCADA data of WT0, WT6, WT11 and WT23, respectively. To verify whether units with similar running states help improve the reliability of the anomaly detection result, the three CSEM construction schemes shown in Table 3 are designed, and the performance of the three CSEMs is tested using the WT0 verification data.

TABLE 3 different combination strategies for CSEM

FIG. 10 compares the gearbox front bearing temperature estimated by the different CSEM schemes in WF1 with the actual temperature, and Table 4 lists the performance indexes of the different CSEMs together with the corresponding maximum residual effective values and information entropy.

TABLE 4 CSEM Performance test results based on different schemes in WF1

Scheme	E_mae	E_rmse	R²	β	H_rm	H_en
Sch1	0.319	0.377	0.967	1.389	0.865	0.328
Sch2	0.331	0.408	0.958	1.296	0.937	0.351
Sch3	0.291	0.323	0.978	1.593	0.543	0.296

As FIG. 10 shows, the temperature curve estimated by the CSEM constructed according to Sch3 is closest to the actual temperature curve, and its performance evaluation indexes are better than those of the other two schemes, which means that using the SCADA data of several similar wind turbines to construct a combined estimation model helps improve the accuracy of the CSEM estimate. In addition, FIG. 10 shows that the performance of the CSEMs in Sch1 and Sch2 is very similar, indicating that an effective CSEM for estimating the state variable of WT0 can also be constructed using data from several similar units.

Further, the abnormal state of the gearbox front bearing is detected based on these three schemes, and the detection results are shown in FIGS. 11 (a) to 11 (c). The three subgraphs in FIGS. 11 (a) to 11 (c) show the detection results of the three schemes. On the left side of each subgraph, the true value and the estimated value of the gearbox front bearing temperature are shown by red and blue curves, and the red dotted line marks the abnormal alarm time recorded by the SCADA system. The right side of each subgraph shows the final anomaly detection result based on the comparison of the residual effective value and the residual information entropy, where the blue and red curves represent the effective value R_m and the information entropy E_n of the residuals, the magenta and cyan horizontal lines represent the corresponding thresholds H_rm and H_en, and the green shaded area is the identified period of abnormal bearing temperature.

As each subgraph of FIGS. 11 (a) to 11 (c) shows, the period in which the estimated bearing temperature on the left differs greatly from the true value is substantially the same as the residual anomaly period detected on the right, which verifies the effectiveness of the proposed anomaly detection method. Comparing the detection results of the three schemes, the earliest time at which Sch3 detects the bearing temperature anomaly is 2019/5/29 6:00, 14 days earlier than the system fault alarm, which shows that comprehensively using models trained on the SCADA data of WT0 and of other similar units improves the sensitivity of the CSEM and thus helps detect anomalies early. In addition, comparing the anomaly detection results of Sch1 and Sch2 shows that the earliest times at which the two schemes detect the abnormal gearbox front bearing temperature are almost the same, both about 10 days earlier than the system fault alarm, which shows that it is feasible to detect anomalies of the unit to be detected using the SCADA data of units with similar running states.

(2) Detection of abnormal oil temperature of a gearbox based on WF2 wind farm data:

According to Table 1, the available data of the WF2 wind farm are relatively limited, so it is more practical to improve the anomaly detection accuracy by using the data of similar units in the same wind farm. Data from 2018/2/1 to 2018/3/23 are selected to quantitatively evaluate the space-time similarity between the running states of each unit in WF2 and WT0. Each unit has 14688 evaluation records in total, of which 12727 remain as valid evaluation data after cleaning. The wind speed, the main shaft rotating speed and the output power are selected as macroscopic variables, and the gearbox oil temperature, the gearbox rear bearing temperature, the gearbox high-speed shaft rotating speed, the gearbox inlet oil temperature and the gearbox front bearing temperature are selected as microscopic variables.

Fig. 12 (a) shows the space-time similarity quantification results between the running states of each unit of WF2 and WT0. Based on these results, the five units WT08, WT05, WT01, WT15 and WT03 were selected as similar units, and Fig. 12 (b) shows a radar chart of the similarity of these five units to WT0.

TABLE 5 Performance test of LSTM model based on different crew data training in WF2

TABLE 6 CSEM Performance test results based on different schemes in WF2

Table 5 gives the results of the two performance tests of the LSTM models trained on the data of WT0 and of these five units. Because WF2 provides less data than WF1, the overall performance of the LSTM model trained on each unit's data is slightly worse than that of the models shown in Table 2. However, the conclusions reflected by Tables 5 and 2 are substantially the same: the models of all units perform well in the accuracy test, but their performance drops significantly in the adaptability test. Similarly, according to the results of the two performance tests, the LSTM models with relatively good performance are selected as sub-models to construct the CSEM; the selected sub-models are those trained on the SCADA data of WT0, WT01, WT05 and WT08. Table 6 shows the performance indexes of the CSEMs constructed under the three schemes, together with the corresponding maximum residual effective value and information entropy. All indexes improve for the CSEM that makes full use of the data of similar wind turbines.

(3) Comparative verification:

To verify the advantages of the proposed method, four different anomaly detection methods were further employed as comparison models. These methods respectively use multiple linear regression (MLR), support vector machines (SVM), deep neural networks (DNN) and multi-model combination estimation (MMC) to construct state estimation models of the target variable. Each of the four comparison models is trained using only the historical data of WT0, and the input and output variables of the models are kept consistent with the previous examples. It should be noted that MMC, like the constructed CSEM, is a combined estimation model built by linear combination, but it differs in that its sub-models are built with different algorithms and are all trained on the data of WT0. The combination weights of MMC are obtained by taking the minimum sum of squared estimation residuals of all sub-models as the objective function, which has to be solved by an optimization algorithm.
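The least-squares weighting used by the MMC comparison model can be illustrated with a minimal sketch. The text does not state which optimization algorithm or which constraints on the weights were used, so the unconstrained solver below, the function names `fit_combination_weights` and `combine`, and the array layout are all assumptions made for illustration only.

```python
import numpy as np

def fit_combination_weights(sub_estimates, y_true):
    """Weights that minimize the sum of squared residuals of a linear combination.

    sub_estimates: array of shape (n_samples, n_models), one column per sub-model
                   (for example MLR, SVM and DNN estimates of the target variable).
    y_true:        array of shape (n_samples,), measured target variable.
    Assumption: no constraint (such as non-negativity or sum-to-one) is imposed.
    """
    A = np.asarray(sub_estimates, dtype=float)
    y = np.asarray(y_true, dtype=float)
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    return weights

def combine(sub_estimates, weights):
    """Combined estimate: weighted sum of the sub-model estimates."""
    return np.asarray(sub_estimates, dtype=float) @ np.asarray(weights, dtype=float)
```

If the weights had to be non-negative or sum to one, a constrained solver would be needed instead of the plain least-squares call shown here.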

Based on the data provided by WF1 and WF2, the performance of the four comparison models and of the CSEM constructed under Sch3 was compared. Fig. 14 (a) and 14 (b) show the performance evaluation results of each model obtained on the WT0 verification data sets of the two wind farms. For ease of comparing the composite score β, the β indexes of all models in Fig. 14 (a) and 14 (b) are normalized values, the normalization base being the largest β value among the five models. It can be found that the combined models MMC and CSEM perform better than the other models, and that the performance differences between models may be related to the total amount of training data. When the training data of WT0 in WF1 are sufficient, the difference in performance evaluation index between models is small, as shown in Fig. 14 (a); when the training data of WT0 in WF2 are limited, the performance of all models except CSEM decreases significantly, as shown in Fig. 14 (b). CSEM maintains good performance with insufficient WT0 training data because several wind turbines with similar running states are used in its construction, which mitigates the influence of insufficient data on model performance.

Next, the WT0 detection data sets of the two wind farms are used to compare the anomaly detection performance of the different methods. For ease of comparison, all methods employ the anomaly identification method presented herein. Fig. 15 (a) and 15 (b) show the anomaly detection results of the different methods. When anomaly detection is performed for WT0 in WF1, the detection results of CSEM and MMC in Fig. 15 (a) are relatively close because the training data are sufficient, and both detect the abnormal temperature of the gearbox front bearing 14 days earlier than the system fault alarm. Moreover, both methods based on combined model estimation detect anomalies earlier than the methods based on single model estimation. However, when the training data of WT0 in WF2 are limited, the anomaly detection results of the different models shown in Fig. 15 (b) differ significantly, but CSEM still detects the abnormal gearbox oil temperature earlier than the other methods. This comparison experiment shows that the proposed method is more sensitive to anomalies and that the accuracy of the anomaly detection result can be ensured even when the training data of the wind turbine are insufficient.

(4) Verification of detection effect under abnormal conditions of training data:

To verify the robustness of the proposed anomaly detection method, the anomaly detection reliability of the different methods is further compared when the training data contain abnormal data. As can be seen from the anomaly detection results shown in Fig. 15 (a) and 15 (b), before the SCADA system fault alarm, the target variables of both faulty units are already detected as abnormal in several time periods. This comparison experiment therefore trains the estimation models on part of the detection data set that contains abnormal data. Since WF2 has less detection data, only the WT0 detection data in WF1 are used in this comparison experiment. The 12774 data records from 2019/5/25 to 2019/6/5 are taken as the new training set and the data from 2019/6/6 to 2019/6/15 as the new detection data; according to the anomaly detection results for WT0 in WF1, abnormal data account for 26.4% of the new training data. All four comparison methods train their models using only the historical data of WT0 from 2019/5/25 to 2019/6/5, while the CSEM is trained on the data of WT0 and of three other similar units over the same period. Since this comparison experiment has no verification data set, the anomaly detection thresholds H_rm and H_en are determined using the new training data set.

Fig. 16 (a) and 16 (b) show the anomaly detection results of the different methods, where Fig. 16 (a) compares the estimated and true values and Fig. 16 (b) shows the identified anomalies. It can be seen that abnormal data in the training data have a significant negative effect on anomaly detection. However, compared with the other methods, CSEM maintains good accuracy in this case: it not only detects the anomaly earliest, but also identifies the most abnormal periods before the system fault alarm. Comparing the anomaly detection results of the different methods shown in Fig. 16 (a) and Fig. 16 (b), the results of the CSEMs constructed with normal training data and with training data containing abnormal values are approximately the same after 2019/6/6, whereas the results of MLR, SVM, DNN and MMC differ significantly after 2019/6/6: MLR, SVM and MMC identify normal temperature data as abnormal, and DNN does not detect the abnormal temperature even by the time of the system fault alarm. These results show that the proposed method can effectively mitigate the influence of abnormal data in the training data of the target unit on the anomaly detection result.

The invention describes in detail the principle of detecting gearbox anomalies of a target unit using SCADA data of other units of the same wind power plant, and describes the specific implementation steps with practical examples. The detection results show that, even when the historical data of the target wind power plant are insufficient or missing, the proposed method can detect the abnormal running state of the gearbox earlier than the SCADA system fault alarm, thereby providing a reference for operators to make maintenance and overhaul plans.

Claims

1. The wind turbine generator system gearbox anomaly detection method considering the similarity of the running states of multiple wind turbine generators is characterized by comprising the following steps:

step 6: selecting a plurality of state estimation LSTM models with better performance in the verification in the step 5, and constructing a combined state estimation model in a weighted combination mode;

2. The method for detecting the abnormality of the gearbox of the wind turbine generator by considering the similarity of the running states of the multiple wind turbine generators according to claim 1 is characterized by comprising the following steps: the step 1 comprises the following steps:

S1.2, the long time series is divided into different short time series by a linearization segmentation method; after linearization segmentation, the state variable time series L1 and L2 can be approximated by polylines of h (h>2) segments and k (k>2) segments respectively, the break points being the abrupt change points of the values on the two curves; each segment of the polyline has one of only three trends, rising, falling or unchanged, represented by 1, -1 and 0 respectively, so the state variable time series curves can be approximately represented by the data sets S₁ and S₂:

The elements of the sets S₁ and S₂ respectively represent the h-th linearized segment of curve L1 and the k-th linearized segment of curve L2; the time sets formed by the corresponding linearization division points are denoted T^s1 and T^s2 respectively, whose elements represent the moments of polylines L1 and L2 at their h-th and k-th division points respectively;

The two segmented curves are then each expressed together with the amplitude trend of their segments; at this time, the state variable time series curves are divided into polyline segments that are equal in number and share the same division points;

S1.4, the degree of similarity S_ST of the state variables is calculated according to formula (4):

In formula (4): the value range of S_ST is [0, 1], and the larger its value, the greater the degree of similarity of the two state variables;

The change amplitude of the ith line segment after the linear segmentation of the curve L2 is represented;
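One way to obtain polyline segments that are equal in number and share the same division points is to merge the division points of both curves and then read off the trend and change amplitude of each common segment. The sketch below illustrates only this preparation step; it does not evaluate formula (4). The function names, the use of sample indices as division points (including both end points), and the tolerance `tol` for the "unchanged" trend are assumptions made for illustration.

```python
import numpy as np

def merge_division_points(t_s1, t_s2):
    """Union of the two division-point sets, arranged in time order."""
    return sorted(set(t_s1) | set(t_s2))

def segment_trends(values, division_points, tol=1e-9):
    """Trend and change amplitude of each segment between consecutive division points.

    values:          sampled state variable curve.
    division_points: sample indices of the division points, including both ends.
    Returns (trends, amplitudes); trend is 1 (rising), -1 (falling) or 0 (unchanged).
    """
    v = np.asarray(values, dtype=float)
    pts = list(division_points)
    amplitudes = np.array([v[b] - v[a] for a, b in zip(pts[:-1], pts[1:])])
    trends = np.where(amplitudes > tol, 1, np.where(amplitudes < -tol, -1, 0))
    return trends, amplitudes

# Hypothetical example: two short curves with different first-pass division points.
curve_1 = [0, 1, 2, 2, 2, 1]
curve_2 = [5, 4, 3, 3, 4, 5]
common = merge_division_points([0, 2, 4, 5], [0, 2, 3, 5])
trends_1, amplitudes_1 = segment_trends(curve_1, common)
trends_2, amplitudes_2 = segment_trends(curve_2, common)
```

After this step both curves are described by the same number of segments, so their trends and amplitudes can be compared segment by segment.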

3. The method for detecting the abnormality of the gearbox of the wind turbine generator by considering the similarity of the running states of the multiple wind turbine generators according to claim 1 is characterized by comprising the following steps: the step 2 comprises the following steps:

In formula (5): S_a is the area of the shaded portion, and the remaining terms are the similarities of the 1st, the (i+1)-th and the (n+1)-th state variables;

4. The method for detecting the abnormality of the gearbox of the wind turbine generator by considering the similarity of the running states of the multiple wind turbine generators according to claim 1 is characterized by comprising the following steps: the step 4 comprises the following steps:

S4.1, constructing a state estimation model of the target variable by adopting LSTM:

In formula (6): W_f and b_f are respectively the weight and bias term of the forget gate; W_i and b_i are respectively the weight and bias term of the input gate; W_o and b_o are respectively the weight and bias term of the output gate; W_c and b_c are respectively the weight and bias term of the current state; g(·) represents the gate function; ⊙ denotes multiplication of the elements at corresponding positions of the matrices; h_{t-1} is the hidden layer variable at time t-1, which has a short-term memory function; C_{t-1} and C_t are the cell layer variables at time t-1 and time t respectively, which have a long-term memory function; X_t is the input variable at time t;

S4.2, constructing a state estimation model of the wind turbine by adopting a two-layer LSTM layer stacking mode; after training a plurality of state estimation LSTM models by using SCADA data of the unit to be detected and other units with similar running states of the unit to be detected, the state estimation LSTM model with relatively good performance is used for constructing a combined state estimation model CSEM.
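The gate structure described for formula (6) and the two-layer stacking of S4.2 can be sketched in NumPy as follows. Formula (6) is not reproduced in this text, so the sketch assumes the standard LSTM cell equations with a sigmoid gate function and a tanh candidate state; the helper names, the random initialization and the omission of the final output layer are likewise assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(input_size, hidden_size, rng):
    """Random weights and zero biases for one LSTM layer (hypothetical helper)."""
    shape = (hidden_size, hidden_size + input_size)
    params = {}
    for gate in ("f", "i", "o", "c"):
        params["W_" + gate] = rng.normal(scale=0.1, size=shape)
        params["b_" + gate] = np.zeros(hidden_size)
    return params

def lstm_step(x_t, h_prev, c_prev, p):
    """One time step of a standard LSTM cell (assumed form of formula (6))."""
    z = np.concatenate([h_prev, x_t])              # [h_{t-1}, X_t]
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])         # forget gate
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])         # input gate
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])         # output gate
    c_tilde = np.tanh(p["W_c"] @ z + p["b_c"])     # candidate current state
    c_t = f_t * c_prev + i_t * c_tilde             # element-wise products
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

def two_layer_lstm(inputs, layer1, layer2, hidden_size):
    """Run a sequence through two stacked LSTM layers; return the last hidden state."""
    h1 = c1 = h2 = c2 = np.zeros(hidden_size)
    for x_t in inputs:
        h1, c1 = lstm_step(x_t, h1, c1, layer1)
        h2, c2 = lstm_step(h1, h2, c2, layer2)     # layer 2 takes layer 1 output
    return h2
```

In practice the last hidden state would be passed to an output layer that maps it to the estimated target variable, and the weights would be learned from SCADA data rather than drawn at random.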

5. The method for detecting the abnormality of the gearbox of the wind turbine generator by considering the similarity of the running states of the multiple wind turbine generators according to claim 1 is characterized by comprising the following steps: in the step 5, the performance evaluation criteria of all the state estimation LSTM models are measured by a comprehensive score index β, and the calculation formula is as follows:

in formula (7): y_i and

6. The method for detecting the abnormality of the gearbox of the wind turbine generator by considering the similarity of the running states of the multiple wind turbine generators according to claim 1 is characterized by comprising the following steps: in the step 6, the combined state estimation model CSEM is constructed by adopting a weighted combination mode, and the specific method is as shown in the formula (8):

In formula (8): M_i represents the i-th LSTM sub-model; m is the number of selected LSTM sub-models; M_CSEM represents the CSEM; γ_i is the similarity quantification value of the wind turbine, calculated based on formula (5); the combination weight is related to the space-time similarity of the running states of the wind turbines, namely, the more similar the running state of a wind turbine in the same wind power plant is to that of the unit to be detected, the greater the contribution of the model trained on the data of that wind turbine to the CSEM;
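The similarity-weighted combination can be illustrated with a short sketch. Formula (8) is not reproduced in this text, so the sketch assumes that each sub-model estimate is weighted by its similarity value γ_i normalized so that the weights sum to 1; the function name and array layout are hypothetical.

```python
import numpy as np

def csem_estimate(sub_model_estimates, similarities):
    """Similarity-weighted combination of LSTM sub-model estimates.

    sub_model_estimates: array of shape (m, n_samples), one row per sub-model M_i.
    similarities:        array of shape (m,), similarity value gamma_i of the unit
                         whose data trained sub-model M_i.
    Assumption: weights are gamma_i divided by the sum of all gamma values.
    """
    estimates = np.asarray(sub_model_estimates, dtype=float)
    gamma = np.asarray(similarities, dtype=float)
    weights = gamma / gamma.sum()
    return weights @ estimates
```

With this normalization, a sub-model trained on a unit that is three times as similar to the unit to be detected contributes three times as much to the combined estimate.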

7. The method for detecting the abnormality of the gearbox of the wind turbine generator by considering the similarity of the running states of the multiple wind turbine generators according to claim 1 is characterized by comprising the following steps: the step 7 comprises the following steps:

In formula (9): I is the number of statistical intervals divided based on the residual distribution; d_i is the number of residual samples falling in the i-th statistical interval; p_i is the proportion of the samples of the corresponding statistical interval among the detection samples; α determines the unit of measure of the information entropy and usually takes the value of the natural base e; T_inv is the sampling interval of the SCADA system; T_r is the window width used for sliding window sampling, which reflects the time scale of detection;

S7.2: three statistical intervals are set, namely the normal area, the risk area and the high-risk area, which are used to calculate the information entropy of each detection period; the three statistical intervals are not continuous, and their boundary values are determined by four important box-plot parameters Q_low, Q₁, Q₃ and Q_up, wherein: Q₁ and Q₃ are respectively the 0.25 quantile and the 0.75 quantile of the box plot; Q_up and Q_low are respectively the upper and lower boundaries of the box plot, whose values are calculated by formula (10);

When the residual value is greater than Q₁ and smaller than Q₃, the residual falls in the normal area; when the residual value is smaller than Q_low or greater than Q_up, the residual falls in the high-risk area; all other residual values fall in the risk area, and the residuals in the high-risk area deviate markedly from the overall normal distribution range;

8. The state variable similarity comparison method based on piecewise linearization is characterized by comprising the following steps of:

A1, let the state variable time series curves of two wind turbines be L1 and L2 respectively, each comprising L sampling points;

A2, dividing the long time series into different short time series by a linearization segmentation method; after linearization segmentation, the state variable time series L1 and L2 can be approximated by polylines of h (h>2) segments and k (k>2) segments respectively, the break points being the abrupt change points of the values on the two curves; each segment of the polyline has one of only three trends, rising, falling or unchanged, represented by 1, -1 and 0 respectively, so the state variable time series curves can be approximately represented by the data sets S₁ and S₂:

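The elements of the sets S₁ and S₂ are the linearized segments of curves L1 and L2; the time sets formed by the corresponding linearization division points are denoted T^s1 and T^s2 respectively, whose elements represent the moments of the polylines at their division points, for example the moment of polyline L1 at its h-th division point;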

A3, performing a secondary linearization segmentation on the basis of the primary linearization segmentation of A2, so that the linearization division points of curves L1 and L2 are identical; the secondary division point set T is obtained by taking the union of the two division point sets T^s1 and T^s2 and arranging the division points in time order, and is expressed as:

After the secondary segmentation, the two curves are each expressed together with the amplitude trend of their segments; at this time, the state variable time series curves are divided into polyline segments that are equal in number and share the same division points;

A4, the degree of similarity S_ST of the state variables is calculated according to formula (4):

9. The method for quantifying the space-time similarity of the running states of the wind turbine by comprehensively considering the similarity of the macroscopic state and the microscopic state of the wind turbine is characterized by comprising the following steps:

b1, selecting a plurality of state variables capable of comprehensively reflecting the similarity of the macroscopic state and the microscopic state of the wind turbine generator, and calculating the similarity of each state variable according to a formula (5);

In formula (5): S_a is the area of the shaded portion; S₀ is the area of the regular polygon; n is the number of selected state variables; the remaining terms are the similarities of the 1st, the i-th, the (i+1)-th and the (n+1)-th state variables;

b2, quantifying the space-time similarity of the running state of the wind turbine generator based on a radar chart area comparison method;

firstly, constructing a regular polygon radar chart whose number of axes equals the number of state variables selected in step B1, defining that each axis of the radar chart represents one state variable, and setting the reference length of each axis to 1;

secondly, drawing an arrow outwards along the axis from the central position of the regular polygon radar chart as a starting point, wherein the length of the arrow represents the similarity degree of the corresponding state variables of the two wind turbine generator systems;

b2.3: finally, all arrow endpoints are sequentially connected to form a closed graph, namely a radar graph shadow part, and the proportion of the area of the closed graph to the area of the radar graph represents the space-time similarity of the running states of the two wind turbine generator systems.
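The radar chart area comparison of step B2 can be worked through in a few lines. Formula (5) is not reproduced in this text, so the sketch derives the areas directly from the geometric description: n axes of reference length 1 spaced 2π/n apart, with adjacent arrow end points joined into triangles. Treating the (n+1)-th similarity as the 1st one (closing the figure) and the function name are assumptions made for illustration.

```python
import math

def radar_similarity(similarities):
    """Ratio of the shaded radar area to the full regular polygon area.

    similarities: per-variable similarity values in [0, 1], one per radar axis.
    """
    s = list(similarities)
    n = len(s)
    if n < 3:
        raise ValueError("a radar chart needs at least three axes")
    angle = 2.0 * math.pi / n
    # Each pair of adjacent arrows spans a triangle of area 0.5 * a * b * sin(angle).
    shaded_area = 0.5 * math.sin(angle) * sum(s[i] * s[(i + 1) % n] for i in range(n))
    # Full regular polygon: all axes at the reference length 1.
    polygon_area = 0.5 * n * math.sin(angle)
    return shaded_area / polygon_area
```

Under these assumptions the ratio reduces to the mean of the products of adjacent similarities, so it equals 1 only when every state variable is fully similar.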

10. The abnormal data identification method based on target variable residual effective value comparison and residual information entropy comparison is characterized by comprising the following steps of:

Step1: through sliding window sampling, the long time series of the target variable of the unit to be detected is divided into a plurality of short time series for anomaly detection, wherein the R_m and E_n values of each short time series are calculated according to formula (9):

In formula (9): I is the number of statistical intervals divided based on the residual distribution; d_i is the number of residual samples falling in the i-th statistical interval; p_i is the proportion of the samples of the corresponding statistical interval among the detection samples; α determines the unit of measure of the information entropy and usually takes the value of the natural base e; T_inv is the sampling interval of the SCADA system; T_r is the window width used for sliding window sampling, which reflects the time scale of detection;

Step2: three statistical intervals are set, namely the normal area, the risk area and the high-risk area, which are used to calculate the information entropy of each detection period; the three statistical intervals are not continuous, and their boundary values are determined by four important box-plot parameters Q_low, Q₁, Q₃ and Q_up, wherein: Q₁ and Q₃ are respectively the 0.25 quantile and the 0.75 quantile of the box plot; Q_up and Q_low are respectively the upper and lower boundaries of the box plot, whose values are calculated by formula (10);

Step3: firstly, the effective value and the information entropy of the residual samples in the detection period are calculated respectively; secondly, it is judged whether the effective value of the residual in the detection period exceeds the threshold H_rm: if the threshold is not exceeded, the residual distribution is within the normal range; if it is exceeded, the residuals of the samples are large, which may be caused either by a few extremely large noise values or by a large number of abnormal residuals, and this is further judged by the information entropy of the residual: if the information entropy of the residual is less than the given threshold H_en, the large effective value is most probably caused by abnormal operation of the detected object; otherwise, it is caused by noise interference.
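Steps 1 to 3 can be sketched as follows. Formulas (9) and (10) are not reproduced in this text, so the sketch assumes that the effective value is the root mean square of the residuals in a window, that the information entropy is the Shannon entropy over the three intervals, and that the box-plot boundaries follow the common 1.5 × interquartile-range rule. All function names are hypothetical.

```python
import numpy as np

def box_plot_bounds(reference_residuals):
    """Q_low, Q1, Q3, Q_up from reference residuals (1.5 * IQR rule is an assumption)."""
    q1, q3 = np.quantile(reference_residuals, [0.25, 0.75])
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q1, q3, q3 + 1.5 * iqr

def window_indicators(residuals, bounds, alpha=np.e):
    """Effective value R_m and information entropy E_n of one detection window."""
    q_low, q1, q3, q_up = bounds
    r = np.asarray(residuals, dtype=float)
    r_m = float(np.sqrt(np.mean(r ** 2)))
    normal = (r > q1) & (r < q3)
    high_risk = (r < q_low) | (r > q_up)
    risk = ~(normal | high_risk)                      # everything in between
    counts = np.array([normal.sum(), risk.sum(), high_risk.sum()], dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]                                      # empty intervals contribute 0
    e_n = float(-np.sum(p * np.log(p)) / np.log(alpha))
    return r_m, e_n

def classify_window(r_m, e_n, h_rm, h_en):
    """Decision rule of Step3; comparison directions follow the text as written."""
    if r_m <= h_rm:
        return "normal"
    return "abnormal" if e_n < h_en else "noise"

def detect(residuals, bounds, h_rm, h_en, window_len, step):
    """Slide a window of window_len samples (T_r / T_inv) over the residual series."""
    r = np.asarray(residuals, dtype=float)
    labels = []
    for start in range(0, len(r) - window_len + 1, step):
        r_m, e_n = window_indicators(r[start:start + window_len], bounds)
        labels.append(classify_window(r_m, e_n, h_rm, h_en))
    return labels
```

Under these assumptions, a window whose residuals are all shifted into the high-risk area has a large effective value and near-zero entropy, whereas a window with one isolated spike keeps its residuals spread over the normal and risk areas and therefore keeps a higher entropy.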