CN112733446A - Data-driven self-adaptive anomaly detection method

Data-driven self-adaptive anomaly detection method

Info

Publication number
CN112733446A
Authority
CN
China
Prior art keywords
data
fan
generator
information
bearing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110016872.7A
Other languages
Chinese (zh)
Inventor
赵莹莹
俞杰
尚笠
顾宁
卢暾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University
Priority to CN202110016872.7A
Publication of CN112733446A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/06 Wind turbines or wind farms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/08 Thermal analysis or thermal optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Control Of Electric Generators (AREA)

Abstract

The invention discloses a data-driven self-adaptive anomaly detection method, applied to detecting anomalies in wind turbine generator bearings. The method uses unsupervised learning and relies only on data from the Supervisory Control and Data Acquisition (SCADA) system installed as standard on the wind turbine. The algorithm assumes that a single wind turbine that has run for a long time is in a normal operating state most of the time, so a model of the turbine's normal operating state can be built without additional, expensive data labels and then used to detect possible bearing anomalies. Experiments show that the algorithm detects generator bearing anomalies with a precision above 80% and a recall above 50%.

Description

Data-driven self-adaptive anomaly detection method
Technical Field
The invention relates to the technical field of data-driven anomaly detection in new energy power generation systems, and in particular to a data-driven self-adaptive anomaly detection method.
Background
Wind power is a clean energy source that is favored worldwide for its low cost and renewability. In recent years, with the rapid growth in the number of wind farms and in installed capacity, ensuring system safety and reliability while reducing operation and maintenance (O&M) costs has become a serious challenge. Failures of wind turbine equipment not only force shutdowns but also incur expensive O&M costs. More importantly, such faults may cause secondary accidents and introduce safety hazards, which greatly reduces the safety and reliability of the system.
Research shows that effective anomaly detection can find wind turbine faults at an early stage and allow O&M to intervene early, which improves system safety and reliability and reduces downtime. Existing anomaly detection methods for wind turbine systems fall mainly into data-driven methods and model-based methods. Data-driven methods use data collected by the existing data acquisition system, namely the SCADA system, and apply supervised machine learning to detect anomalies. Specifically, a supervised method first obtains data from the turbine's normal operating state, models that normal state, and then detects anomalies by comparing the data measured during actual operation with the values predicted by the model. The advantage is that such methods use only data from the SCADA system installed as standard at the factory, require little domain knowledge, and are easy to deploy. The limitation is that existing supervised data-driven methods require large amounts of expensive labeled data; for wind turbine systems in actual operation, labeled data is usually scarce and very costly to collect. Model-based methods typically build a specific physical model from prior domain (physical) knowledge to describe the relationships among physical signals while the turbine is running, detect anomalies from the residual between the predicted and measured signals, and judge the severity of an anomaly from the magnitude and duration of the residual.
Model-based methods demand considerable domain expertise, namely knowledge of the physical principles of wind turbine operation. In addition, building an accurate physical model often requires data acquisition beyond the SCADA system, and the resulting extra O&M cost greatly limits the adoption of such methods. The invention therefore provides a data-driven self-adaptive anomaly detection algorithm that uses unsupervised learning and only the data collected by the SCADA system installed as standard on existing wind turbines.
Disclosure of Invention
The invention provides a data-driven self-adaptive anomaly detection algorithm for wind turbine generator bearings. The algorithm uses unsupervised learning and relies only on data collected by the Supervisory Control and Data Acquisition (SCADA) system installed as standard on the wind turbine. Detection based on this data can effectively improve the operational safety and reliability of new energy power generation systems.
The technical solution of the invention is described in detail below.
A data-driven self-adaptive anomaly detection method comprises the following steps:
Step one: data preprocessing
The raw data collected by the SCADA system is preprocessed in three stages: feature extraction, data cleaning, and data normalization. In feature extraction, the following measuring-point data related to the operating state of the generator bearing are selected from the data that the SCADA system installed as standard on the wind turbine collects at a fixed sampling interval while the turbine runs: wind speed, active power, nacelle temperature, generator three-phase temperature, generator front bearing temperature, and generator rear bearing temperature. A hand-crafted feature, namely the difference between the generator front and rear bearing temperatures, is also created;
Step two: model training
(1) Normal-state modeling
Based on the full operating history of a single wind turbine, a model is built with a long short-term memory (LSTM) network to express the relationship between the front-rear bearing temperature difference and the ambient temperature, active power, and generator three-phase temperature.
The LSTM is suited to time-series prediction of the generator front-rear bearing temperature difference. Here $x$ denotes the input of the LSTM block, $W$ a weight matrix, $b$ a bias vector, $o$ the gating value produced by the output gate, $h$ the LSTM block output, and $C$ the LSTM block state.
In the first step, the LSTM takes the ambient temperature, active power, and generator three-phase temperature at a given time as the input $x$ and uses a sigmoid function $\sigma$, called the forget gate, to decide which information the neuron should discard at time $t$, that is, which parts of the current ambient temperature, active power, and generator three-phase temperature are not needed. This is shown in equation (2):
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right), \tag{2}$$
where $h_{t-1}$ is the output of the previous LSTM block and $b_f$ is a bias vector;
In the second step, the LSTM decides which new information to add to the neuron. First, $h_{t-1}$ and $x_t$ pass through the input gate, which determines the information to be updated, as in equation (3). Then $h_{t-1}$ and $x_t$ pass through a tanh layer to produce the candidate neuron state $\tilde{C}_t$, as in equation (4). Here $\tilde{C}_t$ is the new state information generated for the current neuron from the output of the previous neuron and the three-dimensional input features; this information may be written into the neuron state;
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \tag{3}$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right), \tag{4}$$
In the third step, the LSTM computes the current neuron state $C_t$ from the previous neuron state $C_{t-1}$. The rule is that the forget gate selects which part of the old neuron state to forget, and the input gate selects which part of the candidate state $\tilde{C}_t$ to add, giving the new neuron state $C_t$ in equation (5). The neuron state now incorporates the three-dimensional feature information input at the current time.
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t, \tag{5}$$
Finally, the inputs $h_{t-1}$ and $x_t$ determine which state features of the neuron are output. The inputs pass through a sigmoid function, called the output gate, to obtain the gating value $o_t$; the current neuron state $C_t$ passes through a tanh layer to produce a vector, which is multiplied by $o_t$ to give the final output of the unit. At this point the ambient temperature, active power, and generator three-phase temperature of the current time have passed through the network, and part of their information is retained in the output $h_t$, as shown in equation (6):
$$h_t = o_t * \tanh(C_t), \tag{6}$$
where $o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$.
The mean squared error (MSE) is used as the loss function, as shown in equation (7):
$$\mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N}\left(T_t - \hat{T}_t\right)^2, \tag{7}$$
where $T_t$ is the actual front-rear bearing temperature difference at time $t$, $\hat{T}_t$ is the front-rear bearing temperature difference predicted by the LSTM at time $t$, and $N$ is the length of the prediction period;
(2) Storing the normal-state model
The LSTM model trained on the historical data is stored;
Step three: anomaly detection
(1) Residual prediction
For each wind turbine, over the period in which the generator bearing is to be checked, the stored normal-state model predicts the front-rear bearing temperature difference, and the residual is computed by comparing the prediction with the measured temperature difference;
(2) Anomaly severity assessment
First, median filtering is applied to the residuals to remove noise. The residuals are then averaged over the period to be checked; in line with the timeliness required by operation and maintenance, the predicted residuals are averaged by day. Next, for a single wind turbine, the further a value lies from the bulk of the residuals, the more likely it indicates an anomaly, and the larger the deviation, the more severe the possible anomaly. The anomaly evaluation index, Probability, is given by equation (8):
$$\mathrm{Probability} = \left|2 F(r; \mu, \sigma) - 1\right|, \tag{8}$$
where
$$F(r; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{r} \exp\left(-\frac{(u - \mu)^2}{2\sigma^2}\right) du$$
is the cumulative distribution function of the normal distribution evaluated at the residual $r$, $\mu$ is the mean, and $\sigma$ is the standard deviation;
(3) Precision and recall calculation
FN, FP, TP, and TN are first defined. TP is a wind turbine with a generator bearing anomaly that is correctly detected before the abnormal shutdown; FP is a normally running wind turbine that is wrongly detected as having a bearing anomaly; TN is a normally running wind turbine that is correctly judged to be in a normal operating state; FN is a wind turbine with a generator bearing anomaly that is not detected on that day;
Equations (9) and (10) define precision and recall, respectively.
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{9}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}, \tag{10}$$
(4) Analysis of the relationship between the anomaly severity threshold and precision and recall
A threshold on anomaly severity is set and the effect of varying it on precision and recall is analyzed. The anomaly severity is then determined, and the results are reported to operation and maintenance personnel as recommendations.
Compared with the prior art, the invention has the following beneficial effects:
The anomaly detection method assumes that a single wind turbine that has run for a long time is in a normal operating state most of the time. A model of the turbine's normal operating state can therefore be built without additional, expensive data labels and used to detect possible bearing anomalies, which improves the operational safety and reliability of the new energy power generation system. Experiments show that the algorithm detects generator bearing anomalies with a precision above 80% and a recall above 50%.
Drawings
FIG. 1 is a flow chart of a data-driven adaptive anomaly detection method of the present invention.
Detailed Description
The technical solution of the present invention is described in detail below with reference to examples.
Example 1
This embodiment provides a data-driven self-adaptive anomaly detection method. The algorithm first selects and creates features based on the experience of industry experts. It then exploits the temporal correlation of the data that the SCADA system collects while the wind turbine runs and models the turbine's historical data with a Long Short-Term Memory (LSTM) network. Next, the trained model is applied to the operating data to be checked, and whether the generator bearing is abnormal is determined from the residual between the predicted and measured values. Finally, the severity of the anomaly is determined with the proposed severity assessment method, and the results are reported to operation and maintenance personnel as recommendations. The specific steps are as follows:
(1) Data preprocessing
The raw data collected by the SCADA system is preprocessed to safeguard the detection performance of the model. Preprocessing consists of feature extraction, data cleaning, and data normalization, as follows:
Step one: feature extraction
The SCADA system installed as standard on the wind turbine collects data from hundreds of measuring points at a 10-minute sampling interval while the turbine runs. However, not all of this data is suitable for detecting generator bearing anomalies, and using too much of it can degrade detection performance. Effective features therefore need to be extracted for accurate anomaly detection.
The following measuring-point data that may relate to the operating state of the generator bearing are selected based on expert knowledge: wind speed, active power, nacelle temperature, generator three-phase temperature, generator front bearing temperature, and generator rear bearing temperature. A hand-crafted feature is also created, namely the difference between the generator front and rear bearing temperatures. The rationale is that, when the generator bearing operates normally, there is some unknown relationship between the front-rear bearing temperature difference and the turbine's current operating environment and state (that is, ambient temperature, active power, and generator three-phase temperature). If the front-rear bearing temperature difference deviates from this relationship, the bearing may be abnormal.
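The feature selection described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the record type `ScadaSample`, its field names, and the helper functions are hypothetical, the sign convention (front minus rear) is an assumption, and the nacelle temperature is assumed to stand in for the ambient temperature used as a model input.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScadaSample:
    """One SCADA record for a turbine; all field names are hypothetical."""
    wind_speed: float
    active_power: float
    nacelle_temp: float
    phase_temps: Tuple[float, float, float]  # generator three-phase temperatures
    front_bearing_temp: float
    rear_bearing_temp: float


def bearing_temp_diff(sample: ScadaSample) -> float:
    """Hand-crafted feature: front minus rear bearing temperature (assumed sign)."""
    return sample.front_bearing_temp - sample.rear_bearing_temp


def model_inputs(sample: ScadaSample) -> List[float]:
    """Inputs for the normal-state model; nacelle temperature is assumed
    to play the role of the ambient temperature."""
    return [sample.nacelle_temp, sample.active_power, *sample.phase_temps]
```

The temperature difference serves as the prediction target, while the remaining values describe the operating environment and state.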
Step two: data cleaning
Because of the external environment, interference, and limited sensor sensitivity, the raw data collected by the SCADA system often has quality problems such as noise, stuck ("dead") values, sudden jumps, duplicate values, and null values. Such problems tend to degrade model performance, so effective data cleaning is required. Data cleaning first removes values outside the theoretical range, null values, and duplicates. It then checks for dead values, defined here as generator front and rear bearing temperatures that remain constant for several hours; any dead values found are removed. Finally, noise is removed with a median filtering algorithm.
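The cleaning steps above might look like the following sketch for a single measuring point stored as `(timestamp, value)` pairs. The valid range, the minimum run length that counts as "dead", and the filter window are hypothetical parameters; the source only says "theoretical range" and "several hours".

```python
from statistics import median


def drop_invalid(rows, lo, hi):
    """Remove null values, values outside [lo, hi], and exact duplicate rows."""
    seen = set()
    kept = []
    for ts, value in rows:
        if value is None or not (lo <= value <= hi):
            continue
        if (ts, value) in seen:
            continue
        seen.add((ts, value))
        kept.append((ts, value))
    return kept


def drop_dead_runs(rows, min_run):
    """Remove runs of at least min_run consecutive identical values."""
    kept = []
    i = 0
    while i < len(rows):
        j = i
        while j + 1 < len(rows) and rows[j + 1][1] == rows[i][1]:
            j += 1
        if j - i + 1 < min_run:      # short runs are ordinary data
            kept.extend(rows[i:j + 1])
        i = j + 1
    return kept


def median_filter(values, window):
    """Centered sliding median; the window shrinks at the edges."""
    half = window // 2
    return [median(values[max(0, k - half):k + half + 1])
            for k in range(len(values))]
```

With a 10-minute sampling interval, a `min_run` of 18 would correspond to three hours of constant readings; that figure is only an example.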
Step three: data normalization
Because features in different dimensions can differ greatly in range, the data is usually normalized to improve model performance. Min-max normalization is used here, as shown in equation (1):
$$x' = \frac{x - \min(x)}{\max(x) - \min(x)}, \tag{1}$$
where $x$ is the data before normalization and $x'$ is the data after normalization.
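Equation (1) can be written as a short Python function. This is a sketch only; mapping a constant series to zeros is an assumption added to avoid division by zero, and the source does not discuss that case.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] as in equation (1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant series maps to all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

For example, `min_max_normalize([10.0, 20.0, 30.0])` returns `[0.0, 0.5, 1.0]`.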
(2) Model training
Step one: normal-state modeling
A single wind turbine that has run for a long time should be in a normal operating state most of the time. Since actual turbine operating data is unlabeled, the method uses machine learning to build a model from the full operating history of a single turbine that expresses the relationship between the front-rear bearing temperature difference and the ambient temperature, active power, and generator three-phase temperature. Because the turbine's historical operating data is a time series, the invention builds the model with a long short-term memory (LSTM) network.
LSTM is a recurrent neural network suited to processing and predicting important events separated by relatively long intervals and delays in a time series. It not only feeds the output of each neuron to the next neuron as input but also passes each neuron's state on to the next. Compared with an ordinary recurrent neural network, the LSTM alleviates the vanishing-gradient and exploding-gradient problems, so it is suited to time-series prediction of the generator front-rear bearing temperature difference. Here $x$ denotes the input of the LSTM block, $W$ a weight matrix, $b$ a bias vector, $o$ the gating value produced by the output gate, $h$ the LSTM block output, and $C$ the LSTM block state.
In the first step, the LSTM takes the ambient temperature, active power, and generator three-phase temperature at a given time as the input ($x$) and uses a sigmoid function ($\sigma$), called the forget gate, to decide which information the neuron should discard at time $t$, that is, which parts of the current ambient temperature, active power, and generator three-phase temperature are not needed. Equation (2) shows this process.
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right), \tag{2}$$
where $h_{t-1}$ is the output of the previous LSTM block and $b_f$ is a bias vector.
In the second step, the LSTM decides which new information to add to the neuron. First, $h_{t-1}$ and $x_t$ pass through the input gate, which determines the information to be updated, as in equation (3). Then $h_{t-1}$ and $x_t$ pass through a tanh layer to produce the candidate neuron state $\tilde{C}_t$, as in equation (4). Here $\tilde{C}_t$ is the new state information generated for the current neuron from the output of the previous neuron and the three-dimensional input features; this information may be written into the neuron state.
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \tag{3}$$
$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right), \tag{4}$$
In the third step, the LSTM computes the current neuron state $C_t$ from the previous neuron state $C_{t-1}$. The rule is that the forget gate selects which part of the old neuron state to forget, and the input gate selects which part of the candidate state $\tilde{C}_t$ to add, giving the new neuron state $C_t$ in equation (5). The neuron state now incorporates the three-dimensional feature information input at the current time.
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t, \tag{5}$$
Finally, the inputs $h_{t-1}$ and $x_t$ determine which state features of the neuron are output. The inputs pass through a sigmoid function, called the output gate, to obtain the gating value $o_t$; the current neuron state $C_t$ passes through a tanh layer to produce a vector, which is multiplied by $o_t$ to give the final output of the unit. At this point the ambient temperature, active power, and generator three-phase temperature of the current time have passed through the network, and part of their information is retained in the output ($h_t$), as shown in equation (6).
$$h_t = o_t * \tanh(C_t), \tag{6}$$
where $o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right)$.
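To make the gate computations concrete, the following pure-Python sketch performs one LSTM step in the standard formulation that equations (2) to (6) follow. The parameter layout (a dictionary keyed by `"f"`, `"i"`, `"c"`, `"o"`) and the function names are hypothetical; a practical system would more likely use a deep learning library and train the weights rather than set them by hand.

```python
import math


def sigmoid(z):
    """Logistic function used by the forget, input, and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W·v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step.

    params maps "f", "i", "c", "o" to (W, b) pairs (hypothetical layout);
    each W has one row per hidden unit and len(h_prev) + len(x_t) columns.
    """
    z = list(h_prev) + list(x_t)                           # [h_{t-1}, x_t]
    W_f, b_f = params["f"]
    W_i, b_i = params["i"]
    W_c, b_c = params["c"]
    W_o, b_o = params["o"]
    f_t = [sigmoid(a) for a in affine(W_f, z, b_f)]        # forget gate
    i_t = [sigmoid(a) for a in affine(W_i, z, b_i)]        # input gate
    c_tilde = [math.tanh(a) for a in affine(W_c, z, b_c)]  # candidate state
    c_t = [f * c + i * g                                   # new cell state
           for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = [sigmoid(a) for a in affine(W_o, z, b_o)]        # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]     # block output
    return h_t, c_t
```

Calling `lstm_step` repeatedly over a sequence of inputs, carrying `h_t` and `c_t` forward, yields the hidden states from which a final layer would predict the bearing temperature difference.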
The invention uses the mean squared error (MSE) as the loss function, as shown in equation (7):
$$\mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N}\left(T_t - \hat{T}_t\right)^2, \tag{7}$$
where $T_t$ is the actual front-rear bearing temperature difference at time $t$, $\hat{T}_t$ is the front-rear bearing temperature difference predicted by the LSTM at time $t$, and $N$ is the length of the prediction period.
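The loss in equation (7) reduces to a few lines of Python. The function name and the input validation are illustrative additions.

```python
def mse(actual, predicted):
    """Mean squared error between measured and predicted temperature differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
```

For instance, `mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])` evaluates to 4/3, since only the last point contributes a squared error of 4.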
Step two: storing the normal-state model
The LSTM model trained on the historical data is stored. Because a wind turbine is in a normal operating state for most of its history, any abnormal periods are relatively short. The model built from all the historical data of a single turbine can therefore express the relationship, during normal operation, between the front-rear bearing temperature difference and the ambient temperature and operating state of the turbine.
(3) Anomaly detection
Step one: residual prediction
For each wind turbine, over the period in which the generator bearing is to be checked, the stored normal-state model predicts the front-rear bearing temperature difference, and the residual is computed by comparing the prediction with the measured temperature difference.
Step two: anomaly severity assessment
First, median filtering is applied to the residuals to remove noise; the filter window is empirically set to 2 hours. The residuals are then averaged over the period to be checked. In line with the timeliness required by operation and maintenance, which here means daily O&M, the predicted residuals are averaged by day. Next, since a single wind turbine is assumed to work normally most of the time, the further a value lies from the bulk of the residuals, the more likely it indicates an anomaly, and the larger the deviation, the more severe the possible anomaly. The anomaly evaluation index, Probability, is given by equation (8):
$$\mathrm{Probability} = \left|2 F(r; \mu, \sigma) - 1\right|, \tag{8}$$
where
$$F(r; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{r} \exp\left(-\frac{(u - \mu)^2}{2\sigma^2}\right) du$$
is the cumulative distribution function of the normal distribution evaluated at the residual $r$, $\mu$ is the mean, and $\sigma$ is the standard deviation.
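The severity index in equation (8) can be sketched as below, assuming that $F$ is the normal cumulative distribution function and that $\mu$ and $\sigma$ are estimated from the same turbine's averaged residuals; the source gives neither detail explicitly. The function names and the `samples_per_day` default (144 samples for a 10-minute interval) are illustrative.

```python
import math
from statistics import mean


def normal_cdf(r, mu, sigma):
    """CDF of the normal distribution N(mu, sigma^2) at r."""
    return 0.5 * (1.0 + math.erf((r - mu) / (sigma * math.sqrt(2.0))))


def daily_means(residuals, samples_per_day=144):
    """Average residuals over consecutive blocks of one day each."""
    return [mean(residuals[k:k + samples_per_day])
            for k in range(0, len(residuals), samples_per_day)]


def anomaly_probability(r, mu, sigma):
    """Equation (8): 0 at the mean, approaching 1 far from it on either side."""
    return abs(2.0 * normal_cdf(r, mu, sigma) - 1.0)
```

Under these assumptions a daily residual one standard deviation from the mean scores about 0.68, and one three standard deviations away scores about 0.997, so the index grows with the size of the deviation regardless of its sign.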
Step three: calculating precision and recall
To evaluate the effectiveness of the proposed detection algorithm, FN (false negative), FP (false positive), TP (true positive), and TN (true negative) are first defined. TP is a wind turbine with a generator bearing anomaly that is correctly detected before the abnormal shutdown. FP is a normally running wind turbine that is wrongly detected as having a bearing anomaly. TN is a normally running wind turbine that is correctly judged to be in a normal operating state. FN is a wind turbine with a generator bearing anomaly that is not detected on that day.
Equations (9) and (10) define precision and recall, respectively.
$$\mathrm{Precision} = \frac{TP}{TP + FP}, \tag{9}$$
$$\mathrm{Recall} = \frac{TP}{TP + FN}. \tag{10}$$
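Equations (9) and (10) translate directly into Python. Returning 0.0 when a denominator is zero is a convention assumed here, not something the source specifies, and the counts in the usage note are made up for illustration.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

With three correct detections, one false alarm, and one missed anomaly, `precision_recall(3, 1, 1)` returns `(0.75, 0.75)`.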
Step four: analysis of the relationship between the anomaly severity threshold and precision and recall
A threshold on anomaly severity is set and the effect of varying it on precision and recall is analyzed. In theory, a lower threshold causes more wind turbines to be flagged as abnormal, which raises recall but lowers precision. Conversely, a higher threshold flags only turbines with a high probability of serious anomaly, which raises precision; however, turbines whose front-rear bearing temperature difference changes only slightly are then missed, so recall falls. The threshold can therefore be set flexibly according to the difficulty and requirements of operation and maintenance at the wind farm.
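The trade-off described above can be explored with a small threshold sweep. The scores and labels in the usage note are invented for illustration and are not the patent's experimental data; the function name and the "flag when score is at or above the threshold" rule are assumptions.

```python
def sweep_thresholds(scores, labels, thresholds):
    """For each threshold, flag turbines whose anomaly score is at or above it
    and return (threshold, precision, recall) tuples.

    scores: anomaly probability per turbine
    labels: True if O&M records confirm a bearing fault for that turbine
    """
    results = []
    for th in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= th and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < th and y)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        results.append((th, precision, recall))
    return results
```

For scores `[0.95, 0.9, 0.6, 0.4, 0.2]` with labels `[True, True, False, True, False]`, a threshold of 0.3 gives precision 0.75 and recall 1.0, while a threshold of 0.7 gives precision 1.0 and recall 2/3, matching the qualitative behavior described in the text.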
In a specific implementation, the proposed anomaly detection algorithm was verified at a domestic wind farm with an installed capacity of 48 MW. The farm's SCADA system collected more than one year of operating data at a 10-minute sampling interval. From the farm's operation and maintenance records, the wind turbines with generator bearing faults, their fault shutdown periods, and their periods of normal operation after maintenance were identified and used as label data to verify the proposed data-driven self-adaptive anomaly detection algorithm.
In the experiment, different thresholds on anomaly severity yield different precision and recall values. The lower the threshold, the lower the precision of anomaly detection and the higher the recall. Conversely, the higher the threshold, the higher the precision but the lower the recall, which is consistent with the theoretical analysis. At the experimental wind farm, a reasonable threshold gives a precision above 80% and a recall above 50%.

Claims (5)

1. A data-driven self-adaptive anomaly detection method, characterized by comprising the following steps:
step one, data preprocessing
preprocessing the raw data collected by the supervisory control and data acquisition (SCADA) system, the preprocessing comprising three stages: feature extraction, data cleaning, and data normalization; wherein, in feature extraction, the following measuring-point data related to the operating state of the generator bearing are selected from the data that the SCADA system installed as standard on the wind turbine collects at a fixed sampling interval while the turbine runs: wind speed, active power, nacelle temperature, generator three-phase temperature, generator front bearing temperature, and generator rear bearing temperature; and a hand-crafted feature, namely the difference between the generator front and rear bearing temperatures, is created;
step two, model training
(1) Normal state modeling
Based on all histories of the operation of a single fan, a model is established by adopting a long-short term memory LSTM network so as to express the relationship between the temperature difference of a front bearing and a rear bearing, the ambient temperature, the active power and the three-phase temperature of a generator.
The LSTM is suitable for time series prediction of the temperature difference between the front and rear of the generator bearing, where x represents the input of the LSTM block, W represents the weight, b represents the offset vector, o is the determination condition obtained by the output gate, h is the LSTM block output, and C is the LSTM block state information.
In the first step of the LSTM, the ambient temperature, the active power and the generator three-phase temperature at a given time are taken as the input x, and a Sigmoid function σ, called the forget gate, determines which information the neuron needs to discard at time t, that is, which information in the ambient temperature, active power and generator three-phase temperature at the current time is not needed; the process is shown in formula (2);
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$, (2)
where $h_{t-1}$ denotes the output of the previous LSTM block and $b_f$ denotes a bias vector;
the second step of the LSTM determines the information to be added to the neuron: first, $h_{t-1}$ and $x_t$ pass through the input gate to determine which information is to be updated, as shown in formula (3); then $h_{t-1}$ and $x_t$ pass through a tanh layer to obtain the new candidate neuron information $\tilde{C}_t$, as shown in formula (4), where $\tilde{C}_t$ denotes the new state information of the current neuron generated from the output of the previous neuron and the input three-dimensional features; this information may be written into the neuron information;
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, (3)
$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$, (4)
the third step of the LSTM computes the current neuron state information $C_t$ from the previous neuron state information $C_{t-1}$; the rule is that part of the old neuron information is forgotten as selected by the forget gate, and part of the candidate neuron information $\tilde{C}_t$ is added as selected by the input gate, giving the new neuron information $C_t$, as shown in formula (5); the neuron information at this point incorporates the three-dimensional feature information input at the current time.
$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$, (5)
Finally, $h_{t-1}$ and $x_t$ determine which state features of the neuron are output: the input is passed through a Sigmoid function, called the output gate, to obtain the gating condition $o_t$; the current neuron state $C_t$ is passed through a tanh layer to obtain a vector; and this vector is multiplied by the gating condition obtained from the output gate to give the final output of the unit. In this way the ambient temperature, active power and generator three-phase temperature at the current time complete one pass through the neural network, and part of their information is retained in the output $h_t$, as shown in formula (6).
$h_t = o_t * \tanh(C_t)$, (6)
where $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$.
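As a minimal sketch of one LSTM block update, the following Python function applies the forget gate (2), the input gate (3) and the output formula (6), and assumes the standard LSTM candidate-state and state-update rules for the two intermediate steps. The function names, the `params` dictionary layout and the all-zero weights in the usage lines are hypothetical and serve only to make the arithmetic easy to follow; a trained model would use learned weights.

```python
import math


def sigmoid(z):
    """Logistic function used by the forget, input and output gates."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W.v + b for a weight matrix W (list of rows) and vectors v, b."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM block update; params is a hypothetical dict of weights and biases."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]  # forget gate (2)
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]  # input gate (3)
    # Candidate state through a tanh layer (standard LSTM form, assumed here).
    c_hat = [math.tanh(a) for a in affine(params["W_C"], z, params["b_C"])]
    # State update: forget part of the old state, add part of the candidate.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]  # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]  # block output (6)
    return h_t, c_t


# Usage with three input features and one hidden unit (illustrative values only).
hidden, n_in = 1, 3
zero_W = [[0.0] * (hidden + n_in) for _ in range(hidden)]
params = {name: zero_W for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_C", "b_o")})
h_t, c_t = lstm_step([0.2, 0.5, 0.3], [0.0], [1.0], params)
```

With all-zero weights every gate evaluates to 0.5 and the candidate state to 0, so the new state is simply half of the old state, which makes the role of each gate visible.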
The mean square error (MSE) is used as the loss function, as shown in formula (7):
$\mathrm{MSE} = \frac{1}{N} \sum_{t=1}^{N} (T_t - \hat{T}_t)^2$, (7)
where $T_t$ is the actual front-rear bearing temperature difference at time t, $\hat{T}_t$ is the front-rear bearing temperature difference at time t predicted by the LSTM, and N is the prediction period;
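The loss can be illustrated with a short Python sketch, assuming the usual definition of the mean square error as the average of the squared differences between actual and predicted values over N time steps. The function name `mse` and the sample values are hypothetical.

```python
def mse(actual, predicted):
    """Mean square error between actual and predicted temperature differences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Illustrative values only: three time steps of front-rear bearing temperature difference.
actual_diff = [1.0, 2.0, 3.0]
predicted_diff = [1.0, 2.5, 2.0]
loss = mse(actual_diff, predicted_diff)  # (0 + 0.25 + 1.0) / 3
```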
(2) Storing the normal-state model
The LSTM model trained on historical data is stored;
step three, anomaly detection
(1) Residual prediction
For each wind turbine, the generator bearing is monitored over a period of time; the front-rear bearing temperature difference is predicted with the established normal-state model, and the prediction is compared with the actual temperature difference to calculate the residual;
(2) Assessment of the severity of anomalies
First, median filtering is applied to the obtained residuals to remove noise; then the residuals are averaged over the period to be examined, and, in view of the real-time requirements of operation and maintenance, the predicted residuals are averaged by day; next, for a single wind turbine, the further a residual deviates from the values taken by most residuals, the higher the degree of deviation and the higher the severity of the possible anomaly; the anomaly evaluation index Probability is shown in formula (8):
$\mathrm{Probability} = |2 \cdot F(r; \mu, \sigma) - 1|$, (8)
where
$F(r; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{r} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) dx$
is the cumulative distribution function of the normal distribution, μ is the mean and σ is the standard deviation;
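The severity assessment can be sketched in Python as follows. This is only an illustration under stated assumptions: F is taken to be the normal cumulative distribution function, μ and σ are assumed to be estimated elsewhere (for example from residuals of a healthy period), and the window length, function names and sample residuals are hypothetical.

```python
import math
from statistics import mean, median


def median_filter(values, window=3):
    """Sliding-window median; the window is truncated at both ends of the series."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def normal_cdf(r, mu, sigma):
    """Cumulative distribution function of the normal distribution (assumed form of F)."""
    return 0.5 * (1.0 + math.erf((r - mu) / (sigma * math.sqrt(2.0))))


def severity(r, mu, sigma):
    """Anomaly evaluation index of formula (8): |2 * F(r; mu, sigma) - 1|."""
    return abs(2.0 * normal_cdf(r, mu, sigma) - 1.0)


# Illustrative residuals for one day; the spike of 5.0 is treated as noise.
residuals = [0.1, 0.2, 5.0, 0.1, 0.3]
filtered = median_filter(residuals, window=3)
daily_residual = mean(filtered)            # residuals averaged by day
score = severity(daily_residual, mu=0.0, sigma=1.0)
```

Under this assumption the index is 0 when the daily residual equals the mean and approaches 1 as the residual moves away from the mean in either direction, so a single threshold between 0 and 1 can be applied to it.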
(3) Precision and recall calculation
First, FN, FP, TP and TN are defined: TP is a wind turbine with a generator bearing anomaly that is correctly detected before the abnormal shutdown; FP is a normally operating wind turbine that is wrongly detected as having a bearing anomaly; TN is a normally operating wind turbine that is correctly judged to be in a normal operating state; FN is a wind turbine with a generator bearing anomaly that is not detected on that day;
formulas (9) and (10) define how precision and recall are calculated, respectively.
$\mathrm{Precision} = \frac{TP}{TP + FP}$, (9)
$\mathrm{Recall} = \frac{TP}{TP + FN}$, (10)
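A small Python sketch of the two metrics, using the standard definitions of precision and recall in terms of TP, FP and FN. The function name and the counts in the usage line are hypothetical and are not results from the experiments.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall); a ratio with an empty denominator is reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only: 6 correct detections, 2 false alarms, 4 missed anomalies.
precision, recall = precision_recall(tp=6, fp=2, fn=4)
```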
(4) Analysis of the relationship between the anomaly-severity threshold and precision and recall
A threshold on anomaly severity is set, the effect of changing the threshold on precision and recall is analyzed, the anomaly severity is determined, and the results are generated and recommended to operation and maintenance personnel.
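The threshold analysis can be sketched as a sweep over candidate thresholds. The sketch assumes that a wind turbine is flagged when its severity score is greater than or equal to the threshold; the function name, the scores and the ground-truth labels are hypothetical.

```python
def sweep_thresholds(scores, labels, thresholds):
    """For each threshold return (threshold, precision, recall).

    scores: severity score per wind turbine; labels: True if the bearing was abnormal.
    """
    results = []
    for th in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= th and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < th and y)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        results.append((th, precision, recall))
    return results


# Illustrative data only: six wind turbines, three of them truly abnormal.
scores = [0.95, 0.90, 0.70, 0.65, 0.40, 0.30]
labels = [True, True, False, True, False, False]
table = sweep_thresholds(scores, labels, thresholds=[0.5, 0.8])
```

On this toy data the lower threshold flags every abnormal turbine but also one healthy turbine, while the higher threshold produces no false alarm and misses one abnormal turbine, which is the trade-off the analysis is meant to expose.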
2. The adaptive anomaly detection method according to claim 1, wherein in step one, the sampling interval is 10 minutes.
3. The adaptive anomaly detection method according to claim 1, wherein in step one, the data cleaning method is as follows: for the raw data acquired by the SCADA system, data cleaning first removes data outside the theoretical range, null data and duplicate values; it then checks for stuck values (dead values) and removes any that are found, a stuck value being defined as generator front and rear bearing temperature data that remain unchanged for more than several hours; finally, noise is removed by a median filtering algorithm.
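The cleaning rules of claim 3 can be sketched in Python as below. Several details are assumptions made for illustration: records are (timestamp, value) pairs sorted by time, a duplicate is a repeated timestamp, and a stuck value is a run of identical consecutive readings longer than `max_run` samples. The function name, the bounds and the sample records are hypothetical, and the final median filtering step is omitted.

```python
def clean(records, low, high, max_run):
    """Drop null, out-of-range, duplicate and stuck readings.

    records: list of (timestamp, value) sorted by timestamp; value may be None.
    max_run: longest run of identical consecutive values that is still kept.
    """
    seen = set()
    kept = []
    for ts, value in records:
        if value is None or not (low <= value <= high):
            continue  # null data or data outside the theoretical range
        if ts in seen:
            continue  # duplicate reading (same timestamp, by assumption)
        seen.add(ts)
        kept.append((ts, value))

    result, run = [], []
    for rec in kept:
        if run and rec[1] != run[-1][1]:
            if len(run) <= max_run:
                result.extend(run)  # short run: normal data
            run = []
        run.append(rec)
    if len(run) <= max_run:
        result.extend(run)
    return result


# Illustrative bearing-temperature readings (degrees Celsius) with integer timestamps.
records = [(0, 40.0), (1, None), (2, 500.0), (3, 41.0), (3, 41.0),
           (4, 42.0), (5, 42.0), (6, 42.0), (7, 42.0), (8, 43.0)]
cleaned = clean(records, low=-40.0, high=150.0, max_run=3)
```

With a 10-minute sampling interval, `max_run` would be chosen to match the number of samples in the stuck-value duration; the claim only says "several hours", so the exact value is left open here.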
4. The adaptive anomaly detection method according to claim 1, wherein in step one, the data are normalized using max-min normalization, as shown in formula (1):
$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$, (1)
where x denotes the data before normalization and x' denotes the data after normalization.
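A minimal Python sketch of the normalization, assuming the standard max-min form in which each value is shifted by the minimum and divided by the range. The function name is hypothetical, and returning zeros for a constant series is a choice made here to avoid division by zero, not something stated in the claim.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1] using their minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: range is zero (assumed handling)
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative values only.
scaled = min_max_normalize([10.0, 15.0, 20.0])
```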
5. The adaptive abnormality detection method according to claim 1, characterized in that in step three (2), the filtering time period is 2 hours.
CN202110016872.7A 2021-01-07 2021-01-07 Data-driven self-adaptive anomaly detection method Pending CN112733446A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110016872.7A CN112733446A (en) 2021-01-07 2021-01-07 Data-driven self-adaptive anomaly detection method

Publications (1)

Publication Number Publication Date
CN112733446A true CN112733446A (en) 2021-04-30

Family

ID=75590812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110016872.7A Pending CN112733446A (en) 2021-01-07 2021-01-07 Data-driven self-adaptive anomaly detection method

Country Status (1)

Country Link
CN (1) CN112733446A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101046502A (en) * 2005-06-10 2007-10-03 清华大学 Cable running safety evaluating method
US20090043447A1 (en) * 2007-08-07 2009-02-12 General Electric Company Systems and Methods for Model-Based Sensor Fault Detection and Isolation
CN109738776A (en) * 2019-01-02 2019-05-10 华南理工大学 Fan converter open-circuit fault recognition methods based on LSTM
CN110362048A (en) * 2019-07-12 2019-10-22 上海交通大学 Blower critical component state monitoring method and device, storage medium and terminal
CN110569925A (en) * 2019-09-18 2019-12-13 南京领智数据科技有限公司 LSTM-based time sequence abnormity detection method applied to electric power equipment operation detection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨繁荣: "《大学物理实验》", 31 January 2016, 西南交通大学出版社 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113822344A (en) * 2021-08-30 2021-12-21 中能电力科技开发有限公司 Wind turbine generator front bearing state monitoring method based on data driving
CN113822344B (en) * 2021-08-30 2024-05-31 龙源(北京)新能源工程技术有限公司 Method for monitoring state of front bearing of generator of wind turbine generator based on data driving
CN114296009A (en) * 2022-03-10 2022-04-08 山东汇能电气有限公司 Intelligent analysis system for transformer operation
CN114296009B (en) * 2022-03-10 2022-05-24 山东汇能电气有限公司 Intelligent analysis system for transformer operation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210430