CN106446081A - Method for mining association relationship of time series data based on change consistency - Google Patents

Info

Publication number
CN106446081A
Authority
CN
China
Prior art keywords
cluster
window
change
variable
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610814069.7A
Other languages
Chinese (zh)
Other versions
CN106446081B (en)
Inventor
王文青
杨天社
鲍军鹏
张海龙
吴冠
李方正
王超
齐勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
China Xian Satellite Control Center
Original Assignee
Xian Jiaotong University
China Xian Satellite Control Center
Application filed by Xian Jiaotong University and China Xian Satellite Control Center
Priority to CN201610814069.7A
Publication of CN106446081A
Application granted
Publication of CN106446081B
Legal status: Expired - Fee Related

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G06F16/285: Clustering or classification
    • G06F16/287: Visualization; Browsing
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a method for mining association relationships in time series data based on change consistency. The method comprises the following steps: first, the time series variables are preprocessed; second, a wavelet transform is applied to each single variable by dividing the original time series into a number of windows with a sliding window, performing a discrete wavelet transform on each window, and extracting the maximum wavelet detail coefficient; third, WDC clustering is performed on the maximum wavelet detail coefficients of all windows of the single variable to identify the windows whose wavelet features differ from those of most windows, these windows corresponding to the change points of the variable; and finally, CCP clustering is performed on the change points of all variables. Variables in the same cluster of the clustering result have similar change points, so they exhibit change consistency and are regarded as having a potential association relationship. Because it starts from the change consistency among variables, the method can discover variables with a linear association relationship and can also detect variables with a complex nonlinear association relationship, which makes it valuable for association analysis among the variables of a large, complex system.

Description

Method for mining association relationships in time series data based on change consistency
Technical field
The invention belongs to the fields of intelligent information processing and computer technology, and in particular relates to a method for mining association relationships in time series data based on change consistency.
Background art
In large, complex systems it is often necessary to detect association relationships among multiple variables, which is significant for summarizing the operating rules of the system and for early warning. Complex association relationships may exist among the variables of a system, and these relationships are usually governed by the system's internal rules. In space and time, association can appear as co-occurrence relationships, causal relationships, trend relationships, and so on. When one variable changes, it causes other variables to change correspondingly.
Summary of the invention
The object of the invention is to provide a method for mining association relationships in time series data based on change consistency. The method combines wavelet transform theory, used to detect the change points of a single variable, with clustering theory, used to examine the similarity between the change point vectors of multiple variables, so as to discover potential association relationships among time series variables.
To achieve the above object, the technical solution of the invention is as follows:
In the method for mining association relationships in time series data based on change consistency, the system implementing the method comprises a data preprocessing module, a feature extraction module, a WDC clustering module and a CCP clustering module, and the specific steps are:
1) first, the data preprocessing module 1-1 performs outlier removal, equal-interval interpolation and normalization on the raw time series data to obtain the time series variables in a valid data format;
2) second, the feature extraction module 1-2 applies a discrete wavelet transform to each window of the valid-format time series data and extracts the maximum wavelet detail coefficient;
3) then, the WDC clustering module 1-3 performs WDC clustering on the maximum wavelet detail coefficients of all windows of a single variable; the windows in clusters whose size is below the threshold are change points;
4) finally, the CCP clustering module 1-4 performs CCP clustering on the change point vectors of all variables; variables in the same cluster of the clustering result are associated, and the association relationship and its strength are output for the variables in each cluster.
The outlier removal, equal-interval interpolation and normalization performed by the data preprocessing module on the raw time series data comprise the following steps:
First, the mean and standard deviation of each window are computed. For each data point, if the difference between the point and the mean of the observation window it belongs to exceeds 5 times the standard deviation of that window, the point is an outlier and is rejected;
Then, equal-interval interpolation is applied to the time series after outlier removal. If the sampling interval is Δt and the start time is T, the equally spaced time set after interpolation is {T + n·Δt | n = 0, 1, 2, 3, …}, and the value assigned to T + i·Δt is the value at the nearest moment in the original series that is earlier than T + i·Δt, i.e. the observation at the moment immediately preceding the first moment in the original series that is later than T + i·Δt;
Finally, linear normalization is applied to the data after equal-interval interpolation. The time series is first scanned to obtain the maximum (max) and minimum (min) of the observations, and the normalized value of each observation is then computed as $x_i' = (x_i - \min)/\Delta$, which maps the value range of the original time series onto the interval [0, 1], where x_i is the value of the i-th observation and Δ = max - min.
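The three preprocessing steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the fixed non-overlapping outlier window of 50 points, the use of the population standard deviation, and the handling of a grid time that precedes the first sample are all assumptions. A grid time that coincides with a sample is assumed to take that sample's value.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def remove_outliers(times, values, window=50, k=5.0):
    """Drop points farther than k standard deviations from their window mean."""
    kept_t, kept_v = [], []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        mu, sigma = mean(chunk), pstdev(chunk)
        for t, v in zip(times[start:start + window], chunk):
            if abs(v - mu) <= k * sigma:
                kept_t.append(t)
                kept_v.append(v)
    return kept_t, kept_v


def resample_hold(times, values, t0, dt):
    """Equal-interval grid: each grid time takes the latest observation at or before it."""
    grid = []
    n = 0
    while t0 + n * dt <= times[-1]:
        idx = bisect_right(times, t0 + n * dt) - 1
        # Assumption: a grid time before the first sample reuses the first value.
        grid.append(values[max(idx, 0)])
        n += 1
    return grid


def normalize(values):
    """Linear normalization x' = (x - min) / (max - min) onto [0, 1]."""
    lo, hi = min(values), max(values)
    delta = hi - lo
    if delta == 0:  # constant series: assumed to map to 0
        return [0.0 for _ in values]
    return [(v - lo) / delta for v in values]


# Usage on made-up data: a repeating 0..4 pattern with one spike at t = 10.
times = list(range(50))
values = [float(i % 5) for i in range(50)]
values[10] = 1000.0
kept_t, kept_v = remove_outliers(times, values)
grid = resample_hold(kept_t, kept_v, t0=0, dt=1)
scaled = normalize(grid)
```

One design point worth noting: with a population standard deviation, no single point in a window of n points can lie more than sqrt(n - 1) standard deviations from the window mean, so a 5-sigma rule can only reject points when the window holds more than 26 samples.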
The feature extraction steps of the feature extraction module comprise: First, the data of a single variable is cut with a sliding window. If the sampling start of the raw data is t, the sampling interval is n seconds, the window size is m and the sliding distance is l, then the time span of the first window is [t, t + n·m]; the start time of the second window is the start time of the first window shifted later by l, so the time span of the second window is [t + l, t + l + n·m]; and so on, giving N windows;
Second, a discrete wavelet transform is applied to the data in each window, with the number of wavelet decomposition levels L set according to the window size. The maximum wavelet detail coefficient cD_i in the window is taken as the feature of the window, and [i, cD_i] denotes the wavelet feature of the i-th window of the raw data.
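The windowing and feature extraction described above can be sketched as below. The patent does not name a wavelet, so this sketch assumes the Haar wavelet, implemented by hand to stay dependency-free. It also assumes that "maximum" means largest in magnitude across all decomposition levels, and that window size and sliding distance are both given in samples. All function names are hypothetical.

```python
import math


def sliding_windows(values, size, step):
    """Cut a series into windows of `size` samples, advancing `step` samples each time."""
    return [values[s:s + size] for s in range(0, len(values) - size + 1, step)]


def haar_details(window, levels):
    """Return the Haar detail coefficients of each decomposition level."""
    approx = list(window)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx) - 1, 2)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def max_detail(window, levels=2):
    """Largest detail coefficient by magnitude over all levels (an assumption)."""
    return max(abs(d) for level in haar_details(window, levels) for d in level)


def wavelet_features(values, size, step, levels=2):
    """Return the feature [i, cD_i] of every window as (i, cD_i) tuples."""
    windows = sliding_windows(values, size, step)
    return [(i, max_detail(w, levels)) for i, w in enumerate(windows)]


# Usage on made-up data: a flat series that jumps from 1 to 9 inside the second window.
series = [1.0] * 8 + [1.0, 1.0, 1.0, 9.0, 9.0, 9.0, 9.0, 9.0] + [9.0] * 8
features = wavelet_features(series, size=8, step=8)
```

Only the window containing the jump yields a non-zero feature, which is the property the later clustering step relies on to separate change windows from ordinary ones.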
The WDC clustering steps of the WDC clustering module are:
1) initialize the clusters: each window forms its own cluster, whose center is the window's own wavelet feature [i, cD_i]; the number of windows is denoted m and the number of clusters n, so initially n = m;
2) compute the sum of squared errors SSE_n of the clustering result according to the following formula:
$$SSE_n = \sum_{i=1}^{n} \sum_{j=1}^{w} (cD_j - c_i)^2$$
where n is the number of clusters; w is the number of windows in a cluster; j is the index of a window in cluster i; c_i is the center of cluster i;
3) compute the distance between the centers of any two clusters according to the following formula:
$$dist(c_i, c_j) = |c_i - c_j|, \quad i \neq j$$
where dist(c_i, c_j) is the Manhattan distance between cluster i and cluster j; c_i and c_j are the centers of the two clusters;
4) merge the two closest clusters and update the cluster center according to the following formula:
$$c = \frac{1}{w} \sum_{i=1}^{w} cD_i$$
where c is the cluster center; w is the number of windows in the cluster; cD_i is the maximum wavelet detail coefficient of window i;
5) decrease n by 1;
6) repeat steps 2) to 5) until n = 1;
7) select, according to the following formula, the clustering result at which SSE declines fastest, denoted result = {c_1, c_2, …, c_k}, where k is the number of clusters at that level of the clustering:
$$\max\left\{ \frac{SSE_i}{SSE_{i-1}} \right\}, \quad i = 2, 3, \ldots, m$$
where i is the clustering level; m is the number of windows, i.e. the maximum number of clustering levels;
8) compute the distance between any two clusters in result and select the two closest clusters, denoted c_i, c_j;
9) if dist(c_i, c_j) ≤ d, with d = 0.2, merge the two clusters, compute the center of the new cluster, and repeat step 8);
10) if dist(c_i, c_j) > d, exit the clustering process;
11) in the clustering result, the windows contained in the smaller clusters are the change points of the parameter. A smaller cluster is one whose window count, as a fraction of the total window count, is below the given threshold 0.2. The labels of all windows in the smaller clusters form the change point set of the parameter, i.e. cpv = {cp_1, cp_2, …, cp_m}, where cp_i is a window label.
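A simplified sketch of the WDC steps is given below. It is not the full procedure: it starts from singleton clusters and goes straight to the distance-threshold merging of steps 8) to 10) and the small-cluster rule of step 11), omitting the SSE-based level selection of steps 2) to 7). The SSE formula is shown only as a helper. The function names and the dictionary representation of a cluster are assumptions.

```python
def sse(clusters):
    """Sum of squared errors: squared distance of every cD to its cluster center."""
    total = 0.0
    for members in clusters:
        center = sum(members.values()) / len(members)
        total += sum((cd - center) ** 2 for cd in members.values())
    return total


def wdc_change_points(features, d=0.2, small_ratio=0.2):
    """Merge closest clusters while their centers are within d, then flag small clusters.

    `features` is a list of (window_label, cD) pairs. Each cluster is a dict
    mapping window label to cD, and its center is the mean of its cD values.
    """
    clusters = [{label: cd} for label, cd in features]

    def center(cluster):
        return sum(cluster.values()) / len(cluster)

    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                dist = abs(center(clusters[a]) - center(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        if dist > d:  # step 10): the closest pair is too far apart, stop
            break
        clusters[a].update(clusters[b])  # step 9): merge, center is recomputed on demand
        del clusters[b]

    total = len(features)
    change_points = sorted(
        label for cluster in clusters
        if len(cluster) / total < small_ratio  # step 11): small clusters hold change points
        for label in cluster
    )
    return change_points, clusters


# Usage on made-up features: windows 4 and 7 have much larger detail coefficients.
features = [(i, 0.01 * (i % 3)) for i in range(12)]
features[4] = (4, 3.0)
features[7] = (7, 3.1)
change_points, clusters = wdc_change_points(features)
```

The pairwise search is cubic in the number of windows, which is acceptable for a sketch but would likely need a priority queue or sorted centers for long series.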
The CCP clustering steps of the CCP clustering module comprise:
1) each variable forms its own cluster; with n variables and the number of clusters denoted k, k = n;
2) compute the change consistency coefficient CoC of any two clusters according to the following formulas:
$$CoC(c) = \frac{2}{z(z-1)} \sum_{\{x, y\} \subseteq c} CoC(x, y)$$
where CoC(c) is the change consistency coefficient of cluster c (the new cluster formed by merging c_i and c_j); x and y are any two variables in cluster c; z is the number of variables in the cluster, so there are z(z-1)/2 combinations of two variables, and the change consistency coefficient of a cluster equals the mean of the change consistency coefficients of all pairs of variables in the cluster;
$$CoC(x, y) = \frac{2\,|cpv_{xy}|}{|cpv_x| + |cpv_y|}$$
where CoC(x, y) is the change consistency coefficient of the two variables x and y; |cpv_x| is the number of change points of variable x, i.e. the size of that parameter's change point set; |cpv_y| is the number of change points of variable y; |cpv_xy| is the number of change points common to x and y;
$$cpv_{xy} = cpv_x \cap cpv_y$$
where cpv_x and cpv_y are the change point sets of variables x and y respectively;
3) select the two clusters c_i, c_j with the strongest change consistency, and denote the change consistency coefficient between them max_CoC;
4) if max_CoC is greater than or equal to the given threshold 0.8, merge clusters c_i and c_j, decrease k by 1, and go to step 2);
5) if max_CoC is less than the given threshold, exit the clustering process. In the final clustering result, the variables in the same cluster have an association relationship, and the strength of association between them is the change consistency coefficient CoC of the corresponding cluster.
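The CCP steps can be sketched as follows, assuming each variable's change point set is available as a Python set of window labels. The function names, the treatment of two empty change point sets as CoC = 0, and the reporting of None as the strength of a single-variable cluster are assumptions of this sketch.

```python
from itertools import combinations


def coc_pair(cpv_x, cpv_y):
    """CoC(x, y) = 2 * |cpv_x & cpv_y| / (|cpv_x| + |cpv_y|)."""
    total = len(cpv_x) + len(cpv_y)
    if total == 0:  # assumption: variables without change points score 0
        return 0.0
    return 2 * len(cpv_x & cpv_y) / total


def coc_cluster(names, cpv):
    """Mean pairwise CoC over all z(z-1)/2 variable pairs in a cluster."""
    pairs = list(combinations(names, 2))
    return sum(coc_pair(cpv[x], cpv[y]) for x, y in pairs) / len(pairs)


def ccp_cluster(cpv, threshold=0.8):
    """Repeatedly merge the pair of clusters whose union has the highest CoC."""
    clusters = [[name] for name in cpv]
    while len(clusters) > 1:
        score, i, j = max(
            ((coc_cluster(a + b, cpv), i, j)
             for (i, a), (j, b) in combinations(enumerate(clusters), 2)),
            key=lambda item: item[0],
        )
        if score < threshold:  # step 5): strongest candidate is too weak, stop
            break
        clusters[i] = clusters[i] + clusters[j]  # step 4): merge the two clusters
        del clusters[j]
    # Report each cluster with its association strength (None for singletons).
    return [(c, coc_cluster(c, cpv) if len(c) > 1 else None) for c in clusters]


# Usage on made-up change point sets.
cpv = {
    "a": {1, 5, 9, 12},
    "b": {1, 5, 9, 12},
    "c": {1, 5, 9, 13},
    "d": {2, 6},
}
result = ccp_cluster(cpv)
```

In this made-up example the variables a, b and c end up in one cluster, while d stays on its own because it shares no change points with the others.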
Change consistency means that several time series variables always change at nearby moments. That is, if over a long period several variables almost always change together, or almost always remain unchanged together, these variables have a potential association relationship. The invention uses the change consistency of variables as the basis for mining, from a large set of variables, the subsets of variables that are associated. Compared with the prior art, the invention has the following advantages: it examines the association relationships among multiple variables from the perspective of change consistency, and these relationships may be nonlinear, such as exponential, logarithmic or polynomial functional relationships. It focuses on the association that variables exhibit when they change, whereas general association rule mining methods mine frequent patterns under normal conditions. Compared with the traditional association rule mining methods Apriori and FP-Tree, the invention is suited to association analysis over a large number of variables, from which it discovers potential associations among parameters.
Description of the drawings
Fig. 1 is a block diagram of the modules of the system of the invention.
Fig. 2 is a flow chart of the WDC clustering module of the invention.
Fig. 3 shows the CCP clustering module of the invention.
Table 1 gives the data simulation functions of the example time series variables of the invention.
Fig. 4 shows fragments of the simulated data of some of the example time series variables of the invention.
Table 2 gives the association relationship mining results for the example time series variables in the CCP clustering module.
Detailed description of the embodiments
The invention is described in further detail below with reference to the drawings and an embodiment.
Referring to Fig. 1, the system implementing the invention comprises a data preprocessing module 1-1, a feature extraction module 1-2, a WDC clustering module 1-3 and a CCP clustering module 1-4. The specific technical solution of the invention is:
Step 1: The data preprocessing module 1-1 performs outlier removal, equal-interval interpolation and normalization on the raw time series data to obtain the time series variables in a valid data format;
First, the mean and standard deviation of each window are computed. For each data point, if the difference between the point and the mean of the observation window it belongs to exceeds 5 times the standard deviation of that window, the point is an outlier and is rejected;
Then, equal-interval interpolation is applied to the time series after outlier removal. If the sampling interval is Δt and the start time is T, the equally spaced time set after interpolation is {T + n·Δt | n = 0, 1, 2, 3, …}, and the value assigned to T + i·Δt is the value at the nearest moment in the original series that is earlier than T + i·Δt, i.e. the observation at the moment immediately preceding the first moment in the original series that is later than T + i·Δt;
Finally, linear normalization is applied to the data after equal-interval interpolation. The time series is first scanned to obtain the maximum (max) and minimum (min) of the observations, and the normalized value of each observation is then computed as $x_i' = (x_i - \min)/\Delta$, which maps the value range of the original time series onto the interval [0, 1], where x_i is the value of the i-th observation and Δ = max - min;
Step 2: The feature extraction module 1-2 applies a discrete wavelet transform to each window of the valid-format time series data and extracts the maximum wavelet detail coefficient;
First, the data of a single variable is cut with a sliding window. If the sampling start of the raw data is t, the sampling interval is n seconds, the window size is m and the sliding distance is l, then the time span of the first window is [t, t + n·m]; the start time of the second window is the start time of the first window shifted later by l, so the time span of the second window is [t + l, t + l + n·m]; and so on, giving N windows;
Second, a discrete wavelet transform is applied to the data in each window, with the number of wavelet decomposition levels L set according to the window size. The maximum wavelet detail coefficient cD_i in the window is taken as the feature of the window, and [i, cD_i] denotes the wavelet feature of the i-th window of the raw data;
Step 3: Referring to Fig. 2, the WDC (Wavelet Detail Coefficient) clustering module 1-3 performs WDC clustering on the maximum wavelet detail coefficients of all windows of a single variable; the windows in clusters whose size is below the threshold are change points;
1) first perform step 2-1, the initialization of the clusters: each window forms its own cluster, whose center is the wavelet feature cD_i of the window; the number of windows is denoted m and the number of clusters n, so initially n = m;
2) then perform step 2-2: compute the sum of squared errors SSE_n (Sum of Squared Error) of the clustering result according to the following formula:
$$SSE_n = \sum_{i=1}^{n} \sum_{j=1}^{w} (cD_j - c_i)^2$$
where n is the number of clusters; w is the number of windows in a cluster; j is the index of a window in cluster i; c_i is the center of cluster i;
3) perform step 2-3: compute the distance between the centers of any two clusters according to the following formula:
$$dist(c_i, c_j) = |c_i - c_j|, \quad i \neq j$$
where dist(c_i, c_j) is the Manhattan distance between cluster i and cluster j; c_i and c_j are the centers of the two clusters;
4) perform step 2-4: merge the two closest clusters and update the cluster center according to the following formula:
$$c = \frac{1}{w} \sum_{i=1}^{w} cD_i$$
where c is the cluster center; w is the number of windows in the cluster; cD_i is the maximum wavelet detail coefficient of window i;
5) perform step 2-5: decrease n by 1;
6) perform step 2-6: repeat steps 2) to 5) until n = 1;
7) perform step 2-7: select, according to the following formula, the clustering result at which SSE declines fastest, denoted result = {c_1, c_2, …, c_k}, where k is the number of clusters at that level of the clustering:
$$\max\left\{ \frac{SSE_i}{SSE_{i-1}} \right\}, \quad i = 2, 3, \ldots, m$$
where i is the clustering level; m is the number of windows, i.e. the maximum number of clustering levels;
8) perform step 2-8: compute the distance between any two clusters in result and select the two closest clusters, denoted c_i, c_j;
9) perform step 2-9: if dist(c_i, c_j) ≤ d, with d = 0.2, merge the two clusters, compute the center of the new cluster, and repeat step 8);
10) perform step 2-10: if dist(c_i, c_j) > d, exit the clustering process;
11) in the clustering result, the windows contained in the smaller clusters are the change points of the parameter. A smaller cluster is one whose window count, as a fraction of the total window count, is below the given threshold 0.2. The labels of all windows in the smaller clusters form the change point set of the parameter, i.e. cpv = {cp_1, cp_2, …, cp_m}, where cp_i is a window label.
Step 4: Referring to Fig. 3, the CCP (Clustering based on Change Point) clustering module 1-4 performs CCP clustering on the change point vectors of all variables; variables in the same cluster of the clustering result are associated, and the association relationship and its strength are output for the variables in each cluster;
1) first perform step 3-1: each variable forms its own cluster; with n variables and the number of clusters denoted k, k = n;
2) perform step 3-2: compute the change consistency coefficient CoC of any two clusters according to the following formulas:
$$CoC(c) = \frac{2}{z(z-1)} \sum_{\{x, y\} \subseteq c} CoC(x, y)$$
where CoC(c) is the change consistency coefficient of cluster c (the new cluster formed by merging c_i and c_j); x and y are any two variables in cluster c; z is the number of variables in the cluster, so there are z(z-1)/2 combinations of two variables, and the change consistency coefficient of a cluster equals the mean of the change consistency coefficients of all pairs of variables in the cluster;
$$CoC(x, y) = \frac{2\,|cpv_{xy}|}{|cpv_x| + |cpv_y|}$$
where CoC(x, y) is the change consistency coefficient of the two variables x and y; |cpv_x| is the number of change points of variable x (i.e. the size of that parameter's change point set); |cpv_y| is the number of change points of variable y; |cpv_xy| is the number of change points common to x and y;
$$cpv_{xy} = cpv_x \cap cpv_y$$
where cpv_x and cpv_y are the change point sets of variables x and y respectively;
3) perform step 3-3: select the two clusters c_i, c_j with the strongest change consistency, and denote the change consistency coefficient between them max_CoC;
4) perform step 3-4: if max_CoC is greater than or equal to the given threshold 0.8, merge clusters c_i and c_j, decrease k by 1, and go to step 2);
5) perform step 3-5: if max_CoC is less than the given threshold, exit the clustering process. In the final clustering result, the variables in the same cluster have an association relationship, and the strength of association between them is the change consistency coefficient CoC of the corresponding cluster.
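As a worked illustration of the change consistency coefficient, consider three hypothetical variables a, b and c. Their change point sets below are made up for this example and are not data from the patent.

```latex
\begin{aligned}
cpv_a &= \{3, 8, 15, 21\}, \quad cpv_b = \{3, 8, 15, 21\}, \quad cpv_c = \{3, 8, 15, 30\} \\
CoC(a, b) &= \frac{2 \cdot 4}{4 + 4} = 1 \\
CoC(a, c) &= CoC(b, c) = \frac{2 \cdot 3}{4 + 4} = 0.75 \\
CoC(\{a, b, c\}) &= \frac{2}{3 \cdot 2}\,\bigl(1 + 0.75 + 0.75\bigr) = \frac{2.5}{3} \approx 0.833
\end{aligned}
```

With the threshold 0.8, a and b merge first because CoC(a, b) = 1. The pair (a, c) alone would fall short at 0.75, but the cluster {a, b, c} reaches about 0.833, so under the cluster-level mean c is still merged into the cluster formed by a and b.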
Table 1 gives the simulation functions of the example time series variables. According to these functions, 20 days of data are simulated for each variable at a sampling interval of 20 minutes. There are three groups of correlated variables, each containing 11 variables: the variables of group A are correlated with g_1(x), those of group B with g_2(x), and those of group C with g_3(x). The formulas are as follows:
Table 1
Fig. 4 shows fragments of the simulated data of some of the example time series variables. In the figure, the parts marked with yellow and white bars represent windows, and "cDi" denotes the maximum wavelet detail coefficient of the i-th window.
Table 2 gives the association relationship mining results for the example time series variables in the CCP clustering module. Variables in the same cluster are considered to have an association relationship, and the strength of association between them is the change consistency coefficient CoC of the corresponding cluster.
Table 2

Claims (5)

1. A method for mining association relationships in time series data based on change consistency, characterized in that the system implementing the method comprises a data preprocessing module (1-1), a feature extraction module (1-2), a WDC clustering module (1-3) and a CCP clustering module (1-4), and the specific steps are:
1) first, the data preprocessing module (1-1) performs outlier removal, equal-interval interpolation and normalization on the raw time series data to obtain the time series variables in a valid data format;
2) second, the feature extraction module (1-2) applies a discrete wavelet transform to each window of the valid-format time series data and extracts the maximum wavelet detail coefficient;
3) then, the WDC clustering module (1-3) performs WDC clustering on the maximum wavelet detail coefficients of all windows of a single variable; the windows in clusters whose size is below the threshold are change points;
4) finally, the CCP clustering module (1-4) performs CCP clustering on the change point vectors of all variables; variables in the same cluster of the clustering result are associated, and the association relationship and its strength are output for the variables in each cluster.
2. The method for mining association relationships in time series data based on change consistency according to claim 1, characterized in that the outlier removal, equal-interval interpolation and normalization performed by the data preprocessing module (1-1) on the raw time series data comprise the following steps:
First, the mean and standard deviation of each window are computed. For each data point, if the difference between the point and the mean of the observation window it belongs to exceeds 5 times the standard deviation of that window, the point is an outlier and is rejected;
Then, equal-interval interpolation is applied to the time series after outlier removal. If the sampling interval is Δt and the start time is T, the equally spaced time set after interpolation is {T + n·Δt | n = 0, 1, 2, 3, …}, and the value assigned to T + i·Δt is the value at the nearest moment in the original series that is earlier than T + i·Δt, i.e. the observation at the moment immediately preceding the first moment in the original series that is later than T + i·Δt;
Finally, linear normalization is applied to the data after equal-interval interpolation. The time series is first scanned to obtain the maximum (max) and minimum (min) of the observations, and the normalized value of each observation is then computed as $x_i' = (x_i - \min)/\Delta$, which maps the value range of the original time series onto the interval [0, 1], where x_i is the value of the i-th observation and Δ = max - min.
3. The method for mining association relationships in time series data based on change consistency according to claim 1, characterized in that the feature extraction steps of the feature extraction module (1-2) comprise: First, the data of a single variable is cut with a sliding window. If the sampling start of the raw data is t, the sampling interval is n seconds, the window size is m and the sliding distance is l, then the time span of the first window is [t, t + n·m]; the start time of the second window is the start time of the first window shifted later by l, so the time span of the second window is [t + l, t + l + n·m]; and so on, giving N windows;
Second, a discrete wavelet transform is applied to the data in each window, with the number of wavelet decomposition levels L set according to the window size. The maximum wavelet detail coefficient cD_i in the window is taken as the feature of the window, and [i, cD_i] denotes the wavelet feature of the i-th window of the raw data.
4. The method for mining association relationships in time series data based on change consistency according to claim 3, characterized in that the WDC clustering steps of the WDC clustering module (1-3) comprise:
1) initialize the clusters: each window forms its own cluster, whose center is the window's own wavelet feature [i, cD_i]; the number of windows is denoted m and the number of clusters n, so initially n = m;
2) compute the sum of squared errors SSE_n of the clustering result according to the following formula:
$$SSE_n = \sum_{i=1}^{n} \sum_{j=1}^{w} (cD_j - c_i)^2$$
where n is the number of clusters; w is the number of windows in a cluster; j is the index of a window in cluster i; c_i is the center of cluster i;
3) compute the distance between the centers of any two clusters according to the following formula:
$$dist(c_i, c_j) = |c_i - c_j|, \quad i \neq j$$
where dist(c_i, c_j) is the Manhattan distance between cluster i and cluster j; c_i and c_j are the centers of the two clusters;
4) merge the two closest clusters and update the cluster center according to the following formula:
$$c = \frac{1}{w} \sum_{i=1}^{w} cD_i$$
where c is the cluster center; w is the number of windows in the cluster; cD_i is the maximum wavelet detail coefficient of window i;
5) decrease n by 1;
6) repeat steps 2) to 5) until n = 1;
7) select, according to the following formula, the clustering result at which SSE declines fastest, denoted result = {c_1, c_2, …, c_k}, where k is the number of clusters at that level of the clustering:
$$\max\left\{ \frac{SSE_i}{SSE_{i-1}} \right\}, \quad i = 2, 3, \ldots, m$$
where i is the clustering level; m is the number of windows, i.e. the maximum number of clustering levels;
8) compute the distance between any two clusters in result and select the two closest clusters, denoted c_i, c_j;
9) if dist(c_i, c_j) ≤ d, with d = 0.2, merge the two clusters, compute the center of the new cluster, and repeat step 8);
10) if dist(c_i, c_j) > d, exit the clustering process;
11) in the clustering result, the windows contained in the smaller clusters are the change points of the parameter. A smaller cluster is one whose window count, as a fraction of the total window count, is below the given threshold 0.2. The labels of all windows in the smaller clusters form the change point set of the parameter, i.e. cpv = {cp_1, cp_2, …, cp_m}, where cp_i is a window label.
5. The method for mining association relationships in time series data based on change consistency according to claim 1, characterized in that the CCP clustering steps of the CCP clustering module (1-4) comprise:
1) each variable forms its own cluster; with n variables and the number of clusters denoted k, k = n;
2) compute the change consistency coefficient CoC of any two clusters according to the following formulas:
$$CoC(c) = \frac{2}{z(z-1)} \sum_{\{x, y\} \subseteq c} CoC(x, y)$$
where CoC(c) is the change consistency coefficient of cluster c (the new cluster formed by merging c_i and c_j); x and y are any two variables in cluster c; z is the number of variables in the cluster, so there are z(z-1)/2 combinations of two variables, and the change consistency coefficient of a cluster equals the mean of the change consistency coefficients of all pairs of variables in the cluster:
$$CoC(x, y) = \frac{2\,|cpv_{xy}|}{|cpv_x| + |cpv_y|}$$
where CoC(x, y) is the change consistency coefficient of the two variables x and y; |cpv_x| is the number of change points of variable x, i.e. the size of that parameter's change point set; |cpv_y| is the number of change points of variable y; |cpv_xy| is the number of change points common to x and y;
$$cpv_{xy} = cpv_x \cap cpv_y$$
where cpv_x and cpv_y are the change point sets of variables x and y respectively;
3) select the two clusters c_i, c_j with the strongest change consistency, and denote the change consistency coefficient between them max_CoC;
4) if max_CoC is greater than or equal to the given threshold 0.8, merge clusters c_i and c_j, decrease k by 1, and go to step 2);
5) if max_CoC is less than the given threshold, exit the clustering process. In the final clustering result, the variables in the same cluster have an association relationship, and the strength of association between them is the change consistency coefficient CoC of the corresponding cluster.
CN201610814069.7A 2016-09-09 2016-09-09 The method for excavating time series data incidence relation based on variation consistency Expired - Fee Related CN106446081B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610814069.7A CN106446081B (en) 2016-09-09 2016-09-09 The method for excavating time series data incidence relation based on variation consistency

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610814069.7A CN106446081B (en) 2016-09-09 2016-09-09 The method for excavating time series data incidence relation based on variation consistency

Publications (2)

Publication Number Publication Date
CN106446081A true CN106446081A (en) 2017-02-22
CN106446081B CN106446081B (en) 2019-08-13

Family

ID=58169070

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610814069.7A Expired - Fee Related CN106446081B (en) 2016-09-09 2016-09-09 The method for excavating time series data incidence relation based on variation consistency

Country Status (1)

Country Link
CN (1) CN106446081B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105205111A (en) * 2015-09-01 2015-12-30 西安交通大学 System and method for mining failure modes of time series data
US20160189183A1 (en) * 2014-12-31 2016-06-30 Flytxt BV System and method for automatic discovery, annotation and visualization of customer segments and migration characteristics
CN105843919A (en) * 2016-03-24 2016-08-10 云南大学 Moving object track clustering method based on multi-feature fusion and clustering ensemble

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948646A (en) * 2019-01-24 2019-06-28 西安交通大学 A kind of time series data method for measuring similarity and gauging system
CN112231326A (en) * 2020-09-30 2021-01-15 新华三大数据技术有限公司 Method and server for detecting Ceph object
CN112231326B (en) * 2020-09-30 2022-08-30 新华三大数据技术有限公司 Method and server for detecting Ceph object
CN113282645A (en) * 2021-07-23 2021-08-20 广东粤港澳大湾区硬科技创新研究院 Satellite time sequence parameter analysis method, system, terminal and storage medium
CN116340796A (en) * 2023-05-22 2023-06-27 平安科技(深圳)有限公司 Time sequence data analysis method, device, equipment and storage medium
CN116340796B (en) * 2023-05-22 2023-12-22 平安科技(深圳)有限公司 Time sequence data analysis method, device, equipment and storage medium
CN117472915A (en) * 2023-12-27 2024-01-30 中国西安卫星测控中心 Hierarchical storage method of time sequence data oriented to multiple Key values
CN117472915B (en) * 2023-12-27 2024-03-15 中国西安卫星测控中心 Hierarchical storage method of time sequence data oriented to multiple Key values

Also Published As

Publication number Publication date
CN106446081B (en) 2019-08-13

Similar Documents

Publication Publication Date Title
CN106446081A (en) Method for mining association relationship of time series data based on change consistency
CN104142918A (en) Short text clustering and hotspot theme extraction method based on TF-IDF characteristics
CN104462184B (en) A kind of large-scale data abnormality recognition method based on two-way sampling combination
CN110995475A (en) Power communication network fault detection method based on transfer learning
CN100507971C (en) Independent component analysis based automobile sound identification method
KR101232945B1 (en) Two-class classifying/predicting model making method, computer readable recording medium recording classifying/predicting model making program, and two-class classifying/predicting model making device
CN105528516A (en) Clinic pathology data classification method based on combination of principal component analysis and extreme learning machine
CN111882446A (en) Abnormal account detection method based on graph convolution network
CN108875772B (en) Fault classification model and method based on stacked sparse Gaussian Bernoulli limited Boltzmann machine and reinforcement learning
CN105095238A (en) Decision tree generation method used for detecting fraudulent trade
CN111401573B (en) Working condition state modeling and model correcting method
CN105701470A (en) Analog circuit fault characteristic extraction method based on optimal wavelet packet decomposition
CN111000553A (en) Intelligent classification method for electrocardiogram data based on voting ensemble learning
CN109165672A (en) A kind of Ensemble classifier method based on incremental learning
CN106126910A (en) State Transferring Forecasting Methodology based on Markov state metastasis model and system
CN112434662B (en) Tea leaf scab automatic identification algorithm based on multi-scale convolutional neural network
CN110276357A (en) A kind of method for recognizing verification code based on convolutional neural networks
CN104796365A (en) Modulating signal recognition method based on complexity feature under low signal to noise ratio
CN105426441A (en) Automatic pre-processing method for time series
CN106649438A (en) Time series data unexpected fault detection method
CN103995873B (en) A kind of data digging method and data digging system
CN110059126B (en) LKJ abnormal value data-based complex correlation network analysis method and system
CN116340746A (en) Feature selection method based on random forest improvement
CN106548136A (en) A kind of wireless channel scene classification method
CN107122394A (en) Abnormal deviation data examination method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190813