CN110008253A - Industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy - Google Patents

Info

Publication number
CN110008253A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910244856.6A
Other languages
Chinese (zh)
Other versions
CN110008253B (en)
Inventor
徐正国
王豆
陈积明
程鹏
孙优贤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU


Abstract

The invention discloses an industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy, which can be applied to prognostics and health management of industrial processes. The invention introduces association rule mining into industrial equipment fault prediction and uses an association rule mining algorithm to find the associations among operating parameters. To suit the characteristics of industrial data, the method starts from the variation trends of the equipment operating parameters: the transaction set is generated with the variation trend of each operating parameter as the main indicator, and association rules between parameters are mined on this basis. The mining results are then introduced into abnormal working condition prediction for industrial equipment to obtain more accurate predictions. The method has significant application value for fault prediction and health management in engineering.

Description

Industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy
Technical field
The invention belongs to the technical field of reliability and maintenance engineering, and relates to an industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy.
Background
With the emergence of complex systems and the growing demand for real-time monitoring of industrial processes, modern industrial equipment is often fitted with multiple sensors that monitor its operating state. Meanwhile, multiple failure modes may arise during operation, and a single fault may correspond to several symptoms. In this case the information from a single sensor cannot fully reflect the equipment's operating state, which has given rise to fault prediction based on multi-sensor information. Such prediction analyzes the operating state of the equipment from the combined sensor information in order to make equipment diagnosis and prediction more reliable. As sensing technology continues to develop, using multiple sensors for equipment condition monitoring, fault diagnosis and prediction has become the trend.
Associations exist among the operating parameters of running equipment, yet little work in the fault prediction field has so far combined association rule mining with fault prediction. In fact, for time series data, an equipment fault or failure is usually reflected in the parameters or in features extracted from them, and prediction usually means predicting the variation trend of those parameters or features. Mining the association rules between parameters yields more complete information about the parameters, that is, about the equipment operating state, and provides a basis for subsequent prediction.
Summary of the invention
In view of the state of the prior art, the purpose of the invention is to address the fact that existing data-driven prediction techniques seldom consider the association rules that exist among sensor data. The invention proposes an equipment abnormal working condition prediction method based on operating parameter association rules, and constructs a better-suited wavelet neural network for abnormal working condition prediction (fault prediction).
The concept of the invention is described below:
The invention characterizes the associations among industrial process operating parameters with association rules and studies abnormal working condition prediction based on association rule mining of time series data. To mine association rules for time series data at the sequence level, the invention proposes a time series association rule mining algorithm that includes a two-stage frequent itemset generation process. In the first stage, the variation trend information of the time series is extracted as the basic pattern for association rule mining, and the frequent itemsets of time series variation patterns are found. In the second stage, based on these frequent itemsets of variation patterns, the frequent itemsets that take whole sequences as the basic pattern are found, and association rules are mined for each pair of sequences. The system variables related by the mined association rules are then used for abnormal working condition prediction, and the association rules are introduced into a wavelet neural network to improve prediction accuracy. Because the proposed method takes the association rules among operating parameters into account, it can obtain more accurate fault prediction results.
According to the above concept, the invention proposes an industrial data association rule mining and prediction method based on a two-stage frequent itemset generation strategy, with the following steps:
Step 1: Represent the time series data as piecewise linear segments and symbolize them, constructing a discrete data set suitable for association rule mining;
Step 2: Generate the frequent itemsets of the data set with a two-stage frequent itemset mining algorithm;
Step 3: Generate association rules from the frequent itemsets and extract the rules that meet the minimum support and minimum confidence thresholds;
Step 4: Introduce the association rule mining results into a wavelet neural network and use it to predict abnormal working conditions of industrial equipment.
Based on the above scheme, each step can be implemented as follows.
Preferably, step 1 comprises the following sub-steps:
Step 1.1: Denote the measurement time series of sensor i as X_k^i, i = 1, 2, …, N, where N is the number of sensors and k is the length of the time series. Set the initial fitting start point and the initial fitting end point, denote the current fitting start point by the index start and the current fitting end point by the index end, and denote the fitting error threshold as ω_E;
Step 1.2: For each X_k^i, perform piecewise fitting as follows:
1.2.1 Initialize the breakpoint counter count = 1;
1.2.2 For each fitting start point in turn, execute steps 1) to 4):
1) First compute end = start + h;
2) Fit the data from the start point to the end point with the least squares method and compute the fitting error ERR;
3) If ERR is not greater than the fitting error threshold ω_E, set h = h + 1 and return to step 1);
4) If ERR is greater than the fitting error threshold ω_E, take the fitted line segment over this data as part of the fitted sequence, set start = start + h, record the breakpoint, reset h = 2, and set count = count + 1;
1.2.3 Repeat 1.2.2 until end is greater than k, which yields the fitted line-segment time series and the breakpoint sequence P_i made up of the recorded breakpoints;
Step 1.3: Denote the fitted time series of any sensor as Y_k = {y_1, y_2, …, y_k}. Extract the trend and value information of each fitted line segment, and represent a fitted line segment s_i as a triple of three quantities: k_i, the slope of the segment; the span of the segment on the time axis; and r_i, the growth rate of the segment's data. Here {y_j, y_{j+1}, …, y_{j+h}} is the data covered by the segment and j is the segment's start point.
Representing every line segment of Y_k as a triple gives the triple sequence S_n = {s_1, s_2, …, s_n}, where n is the number of line segments after segmenting the time series X_k;
Step 1.4: Cluster the line segments in the triple sequence and symbolize them, so that the symbols represent the different variation patterns of the equipment or system. The similarity d_ij of line segments s_i and s_j is described with a Euclidean distance, where ω_k and ω_r are weights; the smaller d_ij is, the more similar the variation patterns of the two segments.
Then, using d_ij as the similarity index, cluster S_n with the K-means algorithm and assign the same symbol to all line segments of one class, so that each symbol represents one variation pattern of the operating parameter. This gives the symbolized sequence F_n = {f_1, f_2, …, f_n}, where f_1, f_2, …, f_n are the symbols assigned to line segments 1, 2, …, n;
Step 1.5: For the measurement time series of every two sensors i and j, merge their breakpoint sequences P_i and P_j into P_ij, where n_ij - 1 is the number of breakpoints after merging. Use the merged breakpoints to split and rebuild the two symbolized sequences, which gives the reconstructed symbolized sequences of both sensors.
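The greedy segmentation of step 1.2 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the fitting error is taken to be the largest absolute residual of the least-squares line, a segment is closed at the last sample for which the fit was still acceptable, and the names `fit_error` and `segment` are hypothetical. The default `h0 = 2` follows the reset value of h given in the text.

```python
def fit_error(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns the
    largest absolute residual (an assumed choice of error measure)."""
    n = len(ys)
    mean_x = (n - 1) / 2
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return max(abs(y - (slope * x + intercept)) for x, y in enumerate(ys))


def segment(series, err_threshold, h0=2):
    """Greedy piecewise-linear segmentation; returns breakpoint indices,
    including the first and the last sample."""
    k = len(series)
    breakpoints = [0]
    start = 0
    while start < k - 1:
        h = h0
        # Grow the window [start, start + h] while the line fit stays acceptable.
        while start + h < k and fit_error(series[start:start + h + 1]) <= err_threshold:
            h += 1
        # The last acceptable window ended one sample earlier (or the data ran out).
        end = min(start + h - 1, k - 1)
        breakpoints.append(end)
        start = end
    return breakpoints
```

For a triangular series such as `[0, 1, 2, 3, 4, 3, 2, 1, 0]` with a small threshold, the sketch places one breakpoint at the peak, so the series is covered by a rising and a falling segment.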
Preferably, step 2 comprises the following sub-steps:
Step 2.1: For two measurement time series whose corresponding operating parameters are V_i and V_j, take the symbolized sequences obtained in step 1 and form the transaction set from them, each transaction pairing the symbols of the two sequences. Denote the line-segment class codes contained in the two sequences as a and b respectively, and denote the minimum support thresholds of the two stages as minsup_1 and minsup_2;
Step 2.2: Scan the data set once, compute the support of each item, and obtain the frequent 1-itemsets, following 2.2.1 to 2.2.3:
2.2.1: Let σ(·) denote the support count of an item or itemset, initially 0. Let t_k denote the class code of an item, where t stands for a or b;
2.2.2: For each transaction, update σ(t_k) = σ(t_k) + 1;
2.2.3: For each t_k, if its support is not less than the minimum support threshold minsup_1, t_k is a frequent 1-itemset: retain t_k and record its support count. If its support is less than minsup_1, t_k is not a frequent 1-itemset;
Step 2.3: Form 2-itemsets from the frequent 1-itemsets t_k obtained in step 2.2 and compute their support to find the frequent 2-itemsets, as follows:
2.3.1: Let a_p and b_q denote the items retained by step 2.2 from the original line-segment class codes of the two sequences;
2.3.2: For each {a_p, b_q}, execute the following steps:
1) For each occurrence of {a_p, b_q} in the transaction set, update σ({a_p, b_q}) = σ({a_p, b_q}) + 1;
2) If the support of {a_p, b_q} is not less than minsup_1, {a_p, b_q} is a frequent 2-itemset: retain {a_p, b_q} and record its support count;
Step 2.4: Use the frequent 2-itemsets {a_p, b_q} obtained in step 2.3 to compute the support of every pair of operating parameters over the whole data set and obtain the parameter-level frequent itemsets, as follows. For the itemset {V_i, V_j} formed by two operating parameters V_i and V_j, compute σ({V_i, V_j}) = sum(σ({a_p, b_q})). If the support of {V_i, V_j} is not less than the minimum support threshold minsup_2, retain {V_i, V_j}, record its support, and compute σ(V_i) = sum(σ(a_p)) and σ(V_j) = sum(σ(b_q)).
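The two-stage counting of steps 2.2 to 2.4 can be illustrated for one pair of parameters. The sketch assumes that support is the count divided by the number of transactions and that the two symbol sequences are already aligned and of equal, non-zero length; the function name `two_stage_support` and the returned dictionary keys are hypothetical.

```python
from collections import Counter


def two_stage_support(sym_i, sym_j, minsup1, minsup2):
    """sym_i, sym_j: aligned symbol sequences of two parameters (equal length)."""
    transactions = list(zip(sym_i, sym_j))
    n = len(transactions)

    # Stage 1a: frequent 1-itemsets among the segment symbols of each parameter.
    freq_a = {a: c for a, c in Counter(sym_i).items() if c / n >= minsup1}
    freq_b = {b: c for b, c in Counter(sym_j).items() if c / n >= minsup1}

    # Stage 1b: frequent 2-itemsets {a_p, b_q} built only from frequent items.
    pair_counts = Counter(
        (a, b) for a, b in transactions if a in freq_a and b in freq_b
    )
    freq_pairs = {p: c for p, c in pair_counts.items() if c / n >= minsup1}

    # Stage 2: parameter-level counts are sums over the retained symbol-level counts.
    sigma_pair = sum(freq_pairs.values())
    return {
        "freq_pairs": freq_pairs,
        "sigma_pair": sigma_pair,
        "sigma_i": sum(freq_a.values()),
        "sigma_j": sum(freq_b.values()),
        "is_frequent": sigma_pair / n >= minsup2,
    }
```

Restricting the 2-itemset candidates to symbols that are already frequent is the usual Apriori-style pruning: a pair cannot be frequent if either of its items is not.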
Preferably, step 3 comprises the following sub-steps:
Step 3.1: For each group {V_i, V_j} obtained in step 2 that meets the support threshold, generate the association rules V_j → V_i and V_i → V_j. Denote the minimum confidence threshold as minconf;
Step 3.2: For each generated association rule, compute its confidence and extract the rules as follows: for each rule V_i → V_j, compute the confidence conf(V_i → V_j) = σ({V_i, V_j}) / σ(V_i). If conf(V_i → V_j) is not less than the minimum confidence threshold minconf, retain the rule V_i → V_j and record its support and its confidence ω_i.
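As a small worked example of step 3, the sketch below derives both candidate rules for one parameter pair from the counts produced in step 2. It assumes the standard definition of confidence, the pair count divided by the antecedent count; `extract_rules` and the rule labels are hypothetical names.

```python
def extract_rules(sigma_pair, sigma_i, sigma_j, minconf):
    """Return the rules between V_i and V_j whose confidence reaches minconf."""
    rules = {}
    for label, antecedent_count in (("Vi->Vj", sigma_i), ("Vj->Vi", sigma_j)):
        # Confidence of a rule = count of the pair / count of the antecedent.
        conf = sigma_pair / antecedent_count if antecedent_count else 0.0
        if conf >= minconf:
            rules[label] = conf
    return rules
```

With a pair count of 6 and antecedent counts of 8 and 10, only the first rule survives a threshold of 0.7, because 6/8 = 0.75 while 6/10 = 0.6. The two directions of a rule are therefore not interchangeable.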
Preferably, step 4 comprises the following sub-steps:
Step 4.1: Denote any group of associated parameters extracted from the association rules as {V_1, V_2, …, V_u}, where u is the number of associated parameters and V_u is the rule consequent, that is, the target parameter. Each association rule V_i → V_u, i = 1, 2, …, u-1, has a confidence, denoted ω_i. For the target parameter V_u, a wavelet neural network is used to predict abnormal working conditions;
Step 4.2: Construct the training samples. Denote the preset prediction step as l, let the group of associated parameters extracted by association rule mining be {V_1, V_2, …, V_u}, and form the complete training data set from them. Construct the matrix I_train as the training input of the neural network, in which each column is one training input sample, and construct the corresponding training output O_train;
Step 4.3: Train the wavelet neural network with the constructed training samples. The input parameters are V_i, i = 1, 2, …, u-1, and the output parameter is V_u. During network initialization, the confidences ω_i, i = 1, 2, …, u-1, obtained from the association rules are used to set the initial weights between the input layer and the hidden layer;
Step 4.4: Predict on new data. Denote the preset threshold for the occurrence of an abnormal working condition as ω_p. For newly collected sensor measurements, perform l-step prediction with the model trained in step 4.3. If the predicted value of the target parameter drifts from its initial normal value by more than the threshold ω_p, an abnormal working condition is considered to have occurred.
Preferably, before the equipment fails, the model is rebuilt and retrained each time a predetermined number of new measurements has been collected, in order to obtain more accurate predictions.
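The decision rule of step 4.4 can be sketched as a simple threshold test. The sketch assumes that drift is measured as the relative deviation of the prediction from a non-zero normal value; the function name `first_abnormal_index` and the sample numbers in the usage note are hypothetical.

```python
def first_abnormal_index(predictions, normal_value, threshold):
    """Index of the first predicted value whose relative drift from the
    normal value exceeds the threshold, or None if no value does."""
    for idx, predicted in enumerate(predictions):
        drift = abs(predicted - normal_value) / abs(normal_value)
        if drift > threshold:
            return idx
    return None
```

For example, with a normal value of 100.0 and a threshold of 0.015, the predictions 100.0, 100.5, 101.0, 101.6 first cross the threshold at the fourth value, whose relative drift is 0.016.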
The industrial data association rule mining and prediction method based on a two-stage frequent itemset generation strategy proposed by the invention can be used for complex industrial systems monitored by sensors. Mining association rules among the operating parameters of industrial equipment yields the associations between parameters, and introducing them into wavelet neural network prediction gives more accurate predictions. This supports subsequent maintenance planning, benefits the maintenance and management of equipment with strict reliability requirements, and has good prospects in engineering practice.
Brief description of the drawings
Fig. 1 shows the prediction results for variable 7 under IDV(13) in the embodiment, compared with the true values;
Fig. 2 shows the prediction results for variable 11 under IDV(13) in the embodiment, compared with the true values;
Fig. 3 shows the prediction error rate for variable 7 under IDV(13) in the embodiment;
Fig. 4 shows the prediction error rate for variable 11 under IDV(13) in the embodiment.
Detailed description of the embodiments
A specific embodiment of the invention is further described below with reference to the drawings.
This example uses Tennessee Eastman (TE) process simulation data to describe the operation steps in detail and to verify the effect of the method.
The sampling interval of the data set is 3 minutes, and the data set records the measured value of each sensor variable at every sampling instant. For each operating condition (the normal operating state and the fault states under 21 preset faults), the simulated measurement data form two data sets, a training set and a test set. Each training set contains the measured values of all 52 variables from a 25-hour simulation run. Except for the training set collected in the normal operating state, the other 21 training sets introduce the fault after 1 hour of simulation and record only the measurement data of the following 24 hours. In other words, the training set of the normal operating state has 500 observation samples, and each training set collected under the 21 fault states has 480 observation samples. Each of the 22 test sets contains all variable measurements collected over a 48-hour simulation run, that is, 960 samples. Note that when the 21 process faults are simulated, the fault is introduced after 8 hours of simulation. Therefore, in each test set for the 21 fault states, the first 160 observation samples are normal data and the last 800 observation samples are fault data. In the TE process simulation model only IDV(13) is a soft fault, so this example uses the data related to IDV(13). The industrial data association rule mining and abnormal working condition prediction method proceeds as follows:
Step 1: Represent the time series data as piecewise linear segments and symbolize them, constructing a discrete data set suitable for association rule mining. This step comprises the following sub-steps:
Step 1.1: Denote the measurement time series of sensor i as X_k^i, i = 1, 2, …, N, where N is the number of sensors and k is the length of the time series. Set the initial fitting start point and the initial fitting end point, denote the current fitting start point by the index start and the current fitting end point by the index end, and denote the fitting error threshold as ω_E. Note that in the invention, i and j used as superscripts denote the sensor number, whereas used as subscripts they denote only an ordinal position and are unrelated to the sensor number.
Step 1.2: For each X_k^i, perform piecewise fitting as follows:
1.2.1 Initialize the breakpoint counter count = 1;
1.2.2 For each fitting start point in turn, execute steps 1) to 4):
1) First compute end = start + h;
2) Fit the data from the start point to the end point with the least squares method and compute the fitting error ERR;
3) If ERR is not greater than the fitting error threshold ω_E, set h = h + 1 and return to step 1);
4) If ERR is greater than the fitting error threshold ω_E, take the fitted line segment over this data as part of the fitted sequence, set start = start + h, record the breakpoint, reset h = 2, and set count = count + 1;
1.2.3 Repeat 1.2.2 until end is greater than k, which yields the line-segment time series fitted by the least squares method and the breakpoint sequence P_i made up of the recorded breakpoints;
Step 1.3: Denote the fitted time series of any sensor as Y_k = {y_1, y_2, …, y_k}, which consists of multiple line segments fitted by the least squares method above. Extract the trend and value information of each fitted line segment, and represent a fitted line segment s_i as a triple of three quantities: k_i, the slope of the segment; the span of the segment on the time axis; and r_i, the growth rate of the segment's data. Here {y_j, y_{j+1}, …, y_{j+h}} is the data covered by the segment and j is the segment's start point.
Representing every line segment of Y_k as a triple gives the triple sequence S_n = {s_1, s_2, …, s_n}, where n is the number of line segments after segmenting the time series X_k;
Step 1.4: Cluster the line segments in the triple sequence and symbolize them, so that the symbols represent the different variation patterns of the equipment or system, in preparation for the subsequent association rule mining. The similarity d_ij of line segments s_i and s_j is described with a Euclidean distance, where ω_k and ω_r are weights; the smaller d_ij is, the more similar the variation patterns of the two segments.
Then, using d_ij as the similarity index, cluster S_n with the K-means algorithm and assign the same symbol to all line segments of one class, so that each symbol represents one variation pattern of the operating parameter. This gives the symbolized sequence F_n = {f_1, f_2, …, f_n}, where f_1, f_2, …, f_n are the symbols assigned to line segments 1, 2, …, n;
Step 1.5: For the measurement time series of every two sensors i and j, merge their breakpoint sequences P_i and P_j into P_ij, where n_ij - 1 is the number of breakpoints after merging. Use the merged breakpoints to split and rebuild each of the two symbolized sequences, which gives the reconstructed symbolized sequences of both sensors.
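The alignment in step 1.5 can be sketched as follows. The sketch assumes that each breakpoint list contains the first and the last sample index, that both series cover the same index range, and that each symbol list has one entry per segment; `merge_and_resplit` is a hypothetical name.

```python
def merge_and_resplit(bp_i, sym_i, bp_j, sym_j):
    """Merge two breakpoint lists and re-split both symbol sequences so that
    they share the same segments. bp_* include both end points, and
    len(sym_*) == len(bp_*) - 1."""
    merged = sorted(set(bp_i) | set(bp_j))

    def resplit(bp, sym):
        out = []
        seg = 0
        for left in merged[:-1]:
            # Advance to the original segment that contains the sub-segment
            # starting at index `left`.
            while seg + 1 < len(sym) and bp[seg + 1] <= left:
                seg += 1
            out.append(sym[seg])
        return out

    return merged, resplit(bp_i, sym_i), resplit(bp_j, sym_j)
```

After this step both sequences have the same number of symbols, so the symbols at the same position can be paired into one transaction for the frequent itemset mining of step 2.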
Step 2: Generate the frequent itemsets of the data set with a two-stage frequent itemset mining algorithm. This step comprises the following sub-steps:
Step 2.1: For two measurement time series whose corresponding operating parameters are V_i and V_j, take the symbolized sequences obtained in step 1 and form the transaction set from them, each transaction pairing the symbols of the two sequences. Denote the line-segment class codes contained in the two sequences as a and b respectively, and denote the minimum support thresholds of the two stages as minsup_1 and minsup_2. In this example the minimum support thresholds are set to minsup_1 = 0.2 and minsup_2 = 0.2.
Step 2.2: Scan the data set once, compute the support of each item, and obtain the frequent 1-itemsets, following 2.2.1 to 2.2.3:
2.2.1: Let σ(·) denote the support count of an item or itemset, initially 0. Let t_k denote the class code of an item, where t stands for a or b;
2.2.2: For each transaction, update σ(t_k) = σ(t_k) + 1;
2.2.3: For each t_k, if its support is not less than the minimum support threshold minsup_1, t_k is a frequent 1-itemset: retain t_k and record its support count. If its support is less than minsup_1, t_k is not a frequent 1-itemset;
Step 2.3: Form 2-itemsets from the frequent 1-itemsets t_k obtained in step 2.2 and compute their support to find the frequent 2-itemsets, as follows:
2.3.1: Let a_p and b_q denote the items retained by step 2.2 from the original line-segment class codes of the two sequences;
2.3.2: For each {a_p, b_q}, execute the following steps:
1) For each occurrence of {a_p, b_q} in the transaction set, update σ({a_p, b_q}) = σ({a_p, b_q}) + 1;
2) If the support of {a_p, b_q} is not less than minsup_1, {a_p, b_q} is a frequent 2-itemset: retain {a_p, b_q} and record its support count;
Step 2.4: Use the frequent 2-itemsets {a_p, b_q} obtained in step 2.3 to compute the support of every pair of operating parameters over the whole data set and obtain the parameter-level frequent itemsets, as follows. For the itemset {V_i, V_j} formed by two operating parameters V_i and V_j, compute σ({V_i, V_j}) = sum(σ({a_p, b_q})). If the support of {V_i, V_j} is not less than the minimum support threshold minsup_2, retain {V_i, V_j}, record its support, and compute σ(V_i) = sum(σ(a_p)) and σ(V_j) = sum(σ(b_q)).
Step 3: Generate association rules from the frequent itemsets and extract the rules that meet the minimum support and minimum confidence thresholds. This step comprises the following sub-steps:
Step 3.1: For each group {V_i, V_j} obtained in step 2 that meets the support threshold, generate the association rules V_j → V_i and V_i → V_j. Denote the minimum confidence threshold as minconf; in this example it is set to minconf = 0.7;
Step 3.2: For each generated association rule, compute its confidence and extract the rules as follows: for each rule V_i → V_j, compute the confidence conf(V_i → V_j) = σ({V_i, V_j}) / σ(V_i). If conf(V_i → V_j) is not less than the minimum confidence threshold minconf, retain the rule V_i → V_j and record its support and its confidence ω_i.
This step generates the association rules that meet the threshold conditions. Some of the extracted associated parameters and their confidence values are shown in Table 1. Based on Table 1, this example uses variable 7 and variable 11 as the target parameters for prediction.
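To show how the extracted rules lead to the two target parameters, the sketch below groups the rules of Table 1 by their consequent. The rule values are copied from Table 1; the names `TABLE_1` and `group_by_consequent` are hypothetical, and the grouping itself is only an illustration of how a group of associated parameters could be assembled.

```python
from collections import defaultdict

# (antecedent variable, consequent variable, confidence), as listed in Table 1.
TABLE_1 = [
    (13, 7, 0.7527),
    (16, 7, 0.7446),
    (36, 7, 0.7017),
    (35, 11, 0.7513),
    (36, 11, 0.7390),
]


def group_by_consequent(rules, minconf=0.7):
    """Map each consequent (target) variable to {antecedent: confidence}."""
    groups = defaultdict(dict)
    for antecedent, consequent, conf in rules:
        if conf >= minconf:
            groups[consequent][antecedent] = conf
    return dict(groups)
```

Applied to `TABLE_1`, the grouping yields variable 7 with antecedents 13, 16 and 36, and variable 11 with antecedents 35 and 36.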
Step 4: Introduce the association rule mining results into a wavelet neural network and use it to predict abnormal working conditions of industrial equipment. This step comprises the following sub-steps:
Step 4.1: Denote any group of associated parameters extracted from the association rules as {V_1, V_2, …, V_u}, where u is the number of associated parameters and V_u is the rule consequent, that is, the target parameter. Each association rule V_i → V_u, i = 1, 2, …, u-1, has a confidence, denoted ω_i. For the target parameter V_u, a wavelet neural network is used to predict abnormal working conditions;
Step 4.2: Construct the training samples. Denote the preset prediction step as l; in this example the prediction step is set to 10. Let the group of associated parameters extracted by association rule mining be {V_1, V_2, …, V_u} and form the complete training data set from them. Construct the matrix I_train as the training input of the neural network, in which each column is one training input sample, and construct the corresponding training output O_train.
Specifically, the training set here uses not only the fault data of the variables related to IDV(13) but also the data of those variables under normal operation.
Step 4.3: Train the wavelet neural network with the constructed training samples. The input parameters are V_i, i = 1, 2, …, u-1, and the output parameter is V_u. During network initialization, the confidences ω_i, i = 1, 2, …, u-1, obtained from the association rules are used to set the initial weights between the input layer and the hidden layer. In this example, for variable 7 the input layer has 4 nodes and the hidden layer has 8 nodes; for variable 11 the input layer has 3 nodes and the hidden layer has 6 nodes; the output layer has 1 node for both variables. The wavelet basis function is the Morlet mother wavelet, and the relevant confidence values in Table 1 are used as the initial weights between the input layer and the hidden layer;
Step 4.4: Predict on new data. Denote the preset threshold for the occurrence of an abnormal working condition as ω_p. For newly collected sensor measurements, perform l-step prediction with the model trained in step 4.3. If the predicted value of the target parameter drifts from its initial normal value by more than the threshold ω_p, an abnormal working condition is considered to have occurred. Before the equipment fails, the model is rebuilt and retrained each time N_l new measurements have been collected, in order to obtain more accurate predictions, where N_l depends on the sensor sampling frequency and on the needs of the industrial site. This example verifies the prediction effect on the first 300 samples of the test set (960 sampling points in total) and updates the neural network every 10 samples. The threshold for the occurrence of an abnormal working condition (fault), that is, the fraction by which the parameter deviates from its normal value, is set to ω_p = 0.015.
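The network of step 4.3 can be sketched as a single forward pass. This is only an illustration under several assumptions that the text does not fix: the Morlet wavelet is taken in a commonly used real-valued form, cos(1.75 t) exp(-t^2 / 2); every hidden node starts with the same copy of the rule confidences as its input weights; and each hidden node has a scale and a shift parameter. All function names are hypothetical, and training is not shown.

```python
import math


def morlet(t):
    """A commonly used real-valued Morlet mother wavelet (assumed form)."""
    return math.cos(1.75 * t) * math.exp(-t * t / 2.0)


def init_input_weights(confidences, n_hidden):
    """Assumed scheme: each hidden node starts with the rule confidences
    of the input parameters as its input-to-hidden weights."""
    return [list(confidences) for _ in range(n_hidden)]


def forward(x, w_in, scales, shifts, w_out):
    """One forward pass: wavelet hidden layer, linear single-node output."""
    hidden = []
    for weights, scale, shift in zip(w_in, scales, shifts):
        net = sum(w * xi for w, xi in zip(weights, x))
        hidden.append(morlet((net - shift) / scale))
    return sum(wo * h for wo, h in zip(w_out, hidden))
```

As an illustrative call, `init_input_weights([0.7527, 0.7446, 0.7017], 8)` builds the initial weights for three inputs and eight hidden nodes from the Table 1 confidences of the rules with consequent variable 7; the choice of three inputs here is for illustration only.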
Table 1. Association rules
Rule antecedent    Rule consequent    Confidence
Variable 13        Variable 7         0.7527
Variable 16        Variable 7         0.7446
Variable 36        Variable 7         0.7017
Variable 35        Variable 11        0.7513
Variable 36        Variable 11        0.7390
Table 2. Total prediction error rate
               With association rules    Without association rules
Variable 7     1.0482                    1.8548
Variable 11    0.8536                    1.2135
Fig. 1 and Fig. 2 show the prediction results for variable 7 and variable 11. To verify the benefit of introducing association rules, this example compares the predictions with those of a neural network that does not use association rules. In Fig. 1 and Fig. 2, the vertical solid line marks the actual time at which the abnormal working condition occurs under our threshold setting, and the vertical dashed line and the vertical dash-dotted line mark the predicted occurrence times with and without association rules, respectively. Fig. 1 and Fig. 2 show that the predictions of the proposed method approximate the true values well. The predictions are especially good for the first part of the test data, because that part is operating data under normal conditions, for which the training set is more complete and the values are more concentrated. The proposed method also predicts the failure time well: the predicted value lags the true value by 8 sampling points in Fig. 1 and by 5 sampling points in Fig. 2. Compared with the predictions made without association rules, the proposed method is clearly more accurate. The prediction error rates of variable 7 and variable 11 are shown in Fig. 3 and Fig. 4. To quantify the results further, this example also computes the total prediction error rate, shown in Table 2. In terms of overall prediction error, introducing association rules reduces the total prediction error rate of the neural network from 1.8548 to 1.0482 for variable 7 and from 1.2135 to 0.8536 for variable 11.

Claims (6)

1. An industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy, characterized by the following steps:
Step 1: Represent the time series data as piecewise linear segments and symbolize them, constructing a discrete data set suitable for association rule mining;
Step 2: Generate the frequent itemsets of the data set with a two-stage frequent itemset mining algorithm;
Step 3: Generate association rules from the frequent itemsets and extract the rules that meet the minimum support and minimum confidence thresholds;
Step 4: Introduce the association rule mining results into a wavelet neural network and use it to predict abnormal working conditions of industrial equipment.
2. The industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy according to claim 1, characterized in that step 1 comprises the following sub-steps:
Step 1.1: Denote the sensor measurement time series, where N is the number of sensors and k is the length of the time series; set the initial fitting start point and the initial fitting end point; denote the current fitting start point and fitting end point, and denote the fitting error threshold as ω_E;
Step 1.2: For each time series, perform piecewise fitting as follows:
1.2.1 Initialize the segment point count: count = 1;
1.2.2 For each fitting start point in turn, execute steps 1) to 4):
1) First calculate end = start + h;
2) Fit the data of the current fitting interval using the least squares method, and calculate the fitting error ERR;
3) If the fitting error ERR is not greater than the fitting error threshold ω_E, set h = h + 1 and return to step 1);
4) If the fitting error ERR is greater than the fitting error threshold ω_E, obtain the fitted line segment sequence, set start = start + h, record the segment point, reset h = 2, and set count = count + 1;
1.2.3 Repeat 1.2.2 until end is greater than k, obtaining the fitted line-segment time series and the segment point sequence P_i formed by the segment points;
Step 1.3: Denote the fitted time series of any sensor as Y_k = {y_1, y_2, …, y_k}; extract the trend and numerical information of each fitted line segment, and represent a fitted line segment s_i by the following triple:
where k_i denotes the slope of the line segment, the second element denotes the span of the line segment on the time axis, and r_i denotes the growth rate of the segment data; the data corresponding to the line segment are {y_j, y_{j+1}, …, y_{j+h}}, where j is the starting point of the line segment;
Represent all line segments in the line-segment time series Y_k as triples, obtaining the triple sequence S_n = {s_1, s_2, …, s_n}, where n denotes the number of line segments after the time series X_k is segmented;
Step 1.4: Cluster the line segments in the triple sequence and symbolize them, so as to represent the different variation patterns of the equipment or system; describe the similarity d_ij of line segments s_i and s_j using the Euclidean distance:
where d_ij denotes the similarity of line segments s_i and s_j; the smaller d_ij is, the more similar the variation shapes of the two line segments are; ω_k and ω_r are weights;
Then, according to the similarity index d_ij, cluster S_n using the K-means clustering algorithm and assign the same symbol to all line segments of the same class to represent the variation pattern of the operating parameter, obtaining the symbolized sequence F_n = {f_1, f_2, …, f_n}, where f_1, f_2, …, f_n denote the symbols assigned to the 1st, 2nd, …, n-th line segments respectively;
Step 1.5: For the measurement time series of every two sensors, merge their segment point sequences P_i and P_j into P_ij, where n_ij - 1 is the number of segment points after P_i and P_j are merged; re-segment their symbolized sequences according to the merged segment points, obtaining the reconstructed symbolized sequences.
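The growing-window segmentation of step 1.2 and the segment point merging of step 1.5 can be sketched in Python as follows. This is only an illustrative sketch, not the claimed implementation: the function names (`fit_error`, `segment`, `merge_breakpoints`, `resample_symbols`), the 0-based indexing, the choice of the maximum absolute residual as the fitting error ERR, and the convention that a symbol labels the interval starting at its segment point are all assumptions made for this example.

```python
import bisect
import numpy as np

def fit_error(y):
    """Max absolute residual of a least-squares line through y (assumed ERR measure)."""
    t = np.arange(len(y), dtype=float)
    slope, intercept = np.polyfit(t, y, 1)
    return float(np.max(np.abs(y - (slope * t + intercept))))

def segment(x, err_threshold, h0=2):
    """Growing-window piecewise linear segmentation; returns 0-based segment points."""
    x = np.asarray(x, dtype=float)
    k = len(x)
    points = [0]
    start, h = 0, h0
    while start + h < k:                     # stop once `end` runs past the data
        end = start + h
        if fit_error(x[start:end + 1]) <= err_threshold:
            h += 1                           # fit still acceptable: extend the window
        else:
            start = end                      # start = start + h, as in step 4)
            points.append(start)             # record the segment point
            h = h0                           # reset the window length
    if points[-1] != k - 1:
        points.append(k - 1)                 # close the final segment
    return points

def merge_breakpoints(p_i, p_j):
    """Union of two segment point sequences (step 1.5)."""
    return sorted(set(p_i) | set(p_j))

def resample_symbols(points, symbols, merged):
    """Re-segment a symbol sequence on merged segment points.

    symbols[m] is assumed to label the interval starting at points[m].
    """
    out = []
    for left in merged[:-1]:
        m = bisect.bisect_right(points, left) - 1   # original segment containing `left`
        out.append(symbols[m])
    return out

p_i = segment([0, 1, 2, 3, 4, 3, 2, 1, 0], err_threshold=0.1)   # [0, 5, 8]
p_ij = merge_breakpoints(p_i, [0, 3, 8])                         # [0, 3, 5, 8]
print(resample_symbols(p_i, ["a1", "a2"], p_ij))                 # ['a1', 'a1', 'a2']
```

After re-segmentation both symbol sequences have the same length, so the symbols at each position can be paired into one transaction for the mining stage.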
3. The industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy according to claim 2, characterized in that step 2 comprises the following sub-steps:
Step 2.1: For the operating parameters V_i and V_j corresponding to two measurement time series, obtain the symbolized data of the measurement sequences by step 1 and form a transaction set from them; denote the line-segment class codes contained in the two symbolized sequences separately (written a and b below), and denote the minimum support thresholds of the two stages as minsup_1 and minsup_2 respectively;
Step 2.2: Scan the data set once, calculate the support of each single item, and obtain the frequent 1-itemsets, according to the following procedure 2.2.1 to 2.2.3:
2.2.1: Let σ(·) denote the support count of an item or itemset, initialized to 0; let a class code be t_k, where t stands for a or b;
2.2.2: For each transaction, calculate σ(t_k) = σ(t_k) + 1;
2.2.3: For each t_k, if its support is not less than the minimum support threshold minsup_1, t_k is considered a frequent 1-itemset; retain t_k and record the corresponding support count. If its support is less than the minimum support threshold minsup_1, t_k is not a frequent 1-itemset;
Step 2.3: Form 2-itemsets from the frequent 1-itemsets t_k obtained in step 2.2 and calculate their supports to find the frequent 2-itemsets, according to the following procedure:
2.3.1: Let a_p and b_q denote the items retained by step 2.2 from the original line-segment class codes of the two sequences;
2.3.2: For each {a_p, b_q}, execute the following steps:
1) For each occurrence of {a_p, b_q} in the transaction set, calculate σ({a_p, b_q}) = σ({a_p, b_q}) + 1;
2) If its support is not less than minsup_1, {a_p, b_q} is considered a frequent 2-itemset; retain {a_p, b_q} and record the corresponding support count;
Step 2.4: Using the frequent 2-itemsets {a_p, b_q} obtained in step 2.3, calculate the support of every two operating parameters over the entire data set and obtain the parameter-level frequent itemsets, as follows: for the itemset {V_i, V_j} formed by every two operating parameters V_i and V_j, calculate σ({V_i, V_j}) = sum(σ({a_p, b_q})); if its support is not less than the minimum support threshold minsup_2, retain {V_i, V_j} and record the corresponding support, and calculate σ(V_i) = sum(σ(a_p)) and σ(V_j) = sum(σ(b_q)).
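The two-stage counting of steps 2.2 to 2.4 can be illustrated with a short Python sketch. Several points here are assumptions made for the example rather than details taken from the claim: each transaction is modelled as a pair (a, b) of symbols occupying the same re-segmented interval, support is taken as the count divided by the number of transactions, and σ(V_i) and σ(V_j) are summed over all items retained in step 2.2. The function name `two_stage_itemsets` is hypothetical.

```python
from collections import Counter

def two_stage_itemsets(transactions, minsup1, minsup2):
    """Symbol-level then parameter-level frequent itemsets for one parameter pair.

    transactions: list of (a, b) symbol pairs, one per aligned segment (assumed layout).
    Returns None when the parameter pair {V_i, V_j} is not frequent.
    """
    n = len(transactions)
    if n == 0:
        return None

    # Stage 1a: one scan to count single items, keep the frequent 1-itemsets.
    count_a = Counter(a for a, _ in transactions)
    count_b = Counter(b for _, b in transactions)
    freq_a = {s: c for s, c in count_a.items() if c / n >= minsup1}
    freq_b = {s: c for s, c in count_b.items() if c / n >= minsup1}

    # Stage 1b: count only pairs built from retained items, keep frequent 2-itemsets.
    count_ab = Counter((a, b) for a, b in transactions
                       if a in freq_a and b in freq_b)
    freq_ab = {p: c for p, c in count_ab.items() if c / n >= minsup1}

    # Stage 2: aggregate symbol-level counts to the parameter level.
    sigma_ij = sum(freq_ab.values())
    if sigma_ij / n < minsup2:
        return None
    return {
        "pairs": freq_ab,
        "sigma_ij": sigma_ij,
        "sigma_i": sum(freq_a.values()),   # assumed: sum over retained a_p
        "sigma_j": sum(freq_b.values()),   # assumed: sum over retained b_q
    }

data = ([("a1", "b1")] * 4 + [("a1", "b2")] + [("a2", "b1")] * 3
        + [("a3", "b3")] * 2)
print(two_stage_itemsets(data, minsup1=0.25, minsup2=0.5))
```

Filtering single symbols first keeps the number of candidate pairs small, and the second threshold is then applied once per parameter pair rather than per symbol pair.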
4. The industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy according to claim 3, characterized in that step 3 comprises the following sub-steps:
Step 3.1: For every group {V_i, V_j} obtained in step 2 that satisfies the support threshold, generate the association rules V_j → V_i and V_i → V_j; denote the minimum confidence threshold as minconf;
Step 3.2: For every group of generated association rules, calculate the confidence and extract association rules as follows: for each association rule V_i → V_j, calculate its confidence conf(V_i → V_j); if conf(V_i → V_j) is not less than the minimum confidence threshold minconf, retain the association rule V_i → V_j and record the corresponding support and confidence ω_i.
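The rule extraction of steps 3.1 and 3.2 can be sketched as below. The sketch assumes the standard definition of confidence, conf(V_i → V_j) = σ({V_i, V_j}) / σ(V_i), which matches the support counts computed in step 2.4; the function name `extract_rules`, its parameters, and the dictionary layout of the result are hypothetical.

```python
def extract_rules(sigma_ij, sigma_i, sigma_j, n_transactions, minconf,
                  names=("V_i", "V_j")):
    """Generate both rule directions for one frequent pair and filter by confidence."""
    support = sigma_ij / n_transactions
    candidates = [
        (names[0], names[1], sigma_ij / sigma_i),   # V_i -> V_j
        (names[1], names[0], sigma_ij / sigma_j),   # V_j -> V_i
    ]
    return [
        {"rule": f"{lhs} -> {rhs}", "support": support, "confidence": conf}
        for lhs, rhs, conf in candidates
        if conf >= minconf                          # keep rules meeting minconf
    ]

# With sigma_ij = 7, sigma_i = 8, sigma_j = 7 over 10 transactions:
for rule in extract_rules(7, 8, 7, 10, minconf=0.8):
    print(rule)
```

The recorded confidence of each retained rule is the value that is later reused as ω_i when the network weights are initialized.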
5. The industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy according to claim 4, characterized in that step 4 comprises the following sub-steps:
Step 4.1: Denote any group of associated parameters extracted from the association rules as {V_1, V_2, …, V_u}, where u is the number of associated parameters and V_u is the rule consequent, i.e. the target parameter; each association rule V_i → V_u, i = 1, 2, …, u-1, has a confidence, denoted ω_i; for the target parameter V_u, abnormal working condition prediction is performed using a wavelet neural network;
Step 4.2: Construct training samples: denote the preset prediction step as l and the group of associated parameters extracted by association rule mining as {V_1, V_2, …, V_u}; denote the complete training data set formed from them; construct the following matrix I_train as the training input of the neural network:
where each column of I_train is one training input sample; construct the training output O_train as:
Step 4.3: Train the wavelet neural network using the constructed training samples: the input parameters are V_i, i = 1, 2, …, u-1, and the output parameter is V_u; during network initialization, the confidences ω_i, i = 1, 2, …, u-1, obtained from the association rules are used to set the initial weights between the input layer and the hidden layer of the network;
Step 4.4: Predict on new data: denote the preset abnormal working condition occurrence threshold as ω_p; for newly collected sensor measurement data, perform l-step prediction using the model trained in step 4.3; if the drift of the predicted target parameter value relative to its initial normal value exceeds the set threshold ω_p, an abnormal working condition is considered to occur.
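Two pieces of steps 4.3 and 4.4 lend themselves to a small sketch: seeding the input-to-hidden weights with the rule confidences, and flagging the first predicted sample whose drift exceeds ω_p. The claim does not spell out how the confidences enter the initial weights or how the drift is measured, so the choices below are assumptions made purely for illustration: each input's random initial weights are scaled by its confidence, and the drift is the absolute difference from a baseline value. The function names are hypothetical, and the wavelet neural network itself is not implemented here.

```python
import numpy as np

def init_input_weights(confidences, n_hidden, seed=0):
    """Random input-to-hidden weights, with column i scaled by confidence w_i (assumed scheme)."""
    rng = np.random.default_rng(seed)
    conf = np.asarray(confidences, dtype=float)
    base = rng.uniform(-1.0, 1.0, size=(n_hidden, conf.size))
    return base * conf                       # broadcasting scales each input column

def first_abnormal_step(predictions, baseline, drift_threshold):
    """Index of the first prediction whose absolute drift from baseline exceeds the threshold."""
    drift = np.abs(np.asarray(predictions, dtype=float) - baseline)
    idx = np.flatnonzero(drift > drift_threshold)
    return int(idx[0]) if idx.size else None

W = init_input_weights([0.9, 0.5], n_hidden=3)
print(W.shape)                                                   # (3, 2)
print(first_abnormal_step([1.0, 1.1, 1.6, 2.0], 1.0, 0.5))       # 2
```

Scaling by confidence means that inputs with stronger rules toward the target parameter start with larger-magnitude weights, which is one plausible reading of using the rules to guide initialization.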
6. The industrial data association rule mining and abnormal working condition prediction method based on a two-stage frequent itemset generation strategy according to claim 1, characterized in that, before the equipment fails, as the data are updated, the model is reconstructed and retrained each time a predetermined number of new measurement data has been collected, so as to obtain more accurate prediction results.
CN201910244856.6A 2019-03-28 2019-03-28 Industrial data association rule mining and abnormal working condition prediction method Active CN110008253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910244856.6A CN110008253B (en) 2019-03-28 2019-03-28 Industrial data association rule mining and abnormal working condition prediction method

Publications (2)

Publication Number Publication Date
CN110008253A true CN110008253A (en) 2019-07-12
CN110008253B CN110008253B (en) 2021-02-23

Family

ID=67168723

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910244856.6A Active CN110008253B (en) 2019-03-28 2019-03-28 Industrial data association rule mining and abnormal working condition prediction method

Country Status (1)

Country Link
CN (1) CN110008253B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN201898519U (en) * 2010-09-01 2011-07-13 燕山大学 Equipment maintenance early-warning device with risk control
US20120239600A1 (en) * 2009-12-21 2012-09-20 International Business Machines Corporation Method for training and using a classification model with association rule models
CN103676645A (en) * 2013-12-11 2014-03-26 广东电网公司电力科学研究院 Mining method for association rules in time series data flows
CN108873859A (en) * 2018-05-31 2018-11-23 浙江工业大学 Based on the bridge-type grab ship unloader fault prediction model method for improving correlation rule

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王玲 (Wang Ling) et al.: "Temporal association rule mining algorithm based on frequent itemset tree", 《控制与决策》 (Control and Decision) *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112130541A (en) * 2020-10-20 2020-12-25 陕西煤业新型能源科技股份有限公司 Energy comprehensive management control system based on Internet of things
CN112380274B (en) * 2020-11-16 2023-08-22 北京航空航天大学 Abnormality detection method for control process
CN112380274A (en) * 2020-11-16 2021-02-19 北京航空航天大学 Control process-oriented anomaly detection system
CN112800686A (en) * 2021-03-29 2021-05-14 国网江西省电力有限公司电力科学研究院 Transformer DGA online monitoring data abnormal mode judgment method
CN112801426A (en) * 2021-04-06 2021-05-14 浙江浙能技术研究院有限公司 Industrial process fault fusion prediction method based on correlation parameter mining
CN113032912A (en) * 2021-04-20 2021-06-25 上海交通大学 Ship diesel engine fault detection method based on association rule
CN113792754A (en) * 2021-08-12 2021-12-14 国网江西省电力有限公司电力科学研究院 Method for processing DGA (dissolved gas analysis) online monitoring data of converter transformer by removing different elements and then repairing
CN114936581A (en) * 2022-06-01 2022-08-23 中国人民解放军63796部队 Multi-parameter association mining method based on time sequence data segmentation
CN114936581B (en) * 2022-06-01 2024-04-26 中国人民解放军63796部队 Multi-parameter association mining method based on time sequence data segmentation
CN115497267A (en) * 2022-09-06 2022-12-20 江西小手软件技术有限公司 Equipment early warning platform based on time sequence association rule
CN115689071A (en) * 2023-01-03 2023-02-03 南京工大金泓能源科技有限公司 Equipment fault fusion prediction method and system based on correlation parameter mining
CN116204842A (en) * 2023-03-10 2023-06-02 广东省建设工程质量安全检测总站有限公司 Abnormality monitoring method and system for electrical equipment
CN116204842B (en) * 2023-03-10 2023-09-08 广东省建设工程质量安全检测总站有限公司 Abnormality monitoring method and system for electrical equipment

Also Published As

Publication number Publication date
CN110008253B (en) 2021-02-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant