CN108536862A - A time series similarity measurement method based on dynamic time warping - Google Patents

A time series similarity measurement method based on dynamic time warping

Info

Publication number
CN108536862A
Authority
CN
China
Prior art keywords
time series
dist
time
length
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810355812.6A
Other languages
Chinese (zh)
Inventor
刘良桂
李炜
贾会玲
张宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Sci Tech University ZSTU
Original Assignee
Zhejiang Sci Tech University ZSTU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-04-19
Filing date
2018-04-19
Publication date
2018-09-14
Application filed by Zhejiang Sci Tech University ZSTU
Priority to CN201810355812.6A
Publication of CN108536862A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Measurement Of Unknown Time Intervals (AREA)

Abstract

The present invention discloses a time series similarity measurement method. The method combines the dynamic time warping (DTW) algorithm with the derivative dynamic time warping (DDTW) algorithm, which improves the accuracy of time series similarity measurement and provides a solid foundation for subsequent research on time series.

Description

A time series similarity measurement method based on dynamic time warping
Technical field
The present invention relates to the field of data analysis, and in particular to a method for measuring similarity between time series.
Background
Today, with the continuous development of Internet technology, electronic devices, and software, every industry generates huge amounts of data at every moment. Data volumes grow exponentially and are characterized by large scale, many data types, fast updates, and high intrinsic value. A time series is a sequence of numerical values or symbols that is associated with time and ordered chronologically. Time series are very common in real life, especially in fields such as economics, meteorology, and biomedicine, and some non-time-series data can also be converted into time series data for analysis. How to mine hidden, useful information from massive time series data is therefore one of the key research topics in data mining.
Time series data mining is a core subfield of data mining with a very wide range of applications. As important foundational work within time series data mining, time series similarity measurement is a prerequisite for other data mining tasks such as classification, clustering, anomaly detection, and pattern recognition. From this perspective, the quality of the similarity measure determines, to some extent, the efficiency of time series data mining algorithms. Many similarity measures exist for time series; common ones include Euclidean distance (ED) and dynamic time warping (DTW). However, many of these measures compute only the distance between two time series and do not consider the shape features of the series. A better method is therefore needed that accounts for the shape features of the time series while computing the distance between them.
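For reference, the two measures named above can be written as follows. This is the usual textbook formulation rather than text from the patent; the symbols $a_i$, $b_j$, and $D(i,j)$ and the absolute-difference local cost are assumptions introduced here for illustration.

```latex
% Euclidean distance between two series of equal length n
\mathrm{ED}(A,B) = \sqrt{\sum_{i=1}^{n} (a_i - b_i)^2}

% DTW: cumulative cost of the cheapest warping path ending at (i, j)
D(i,j) = |a_i - b_j| + \min\{\, D(i-1,j),\; D(i,j-1),\; D(i-1,j-1) \,\},
\qquad D(0,0) = 0,\quad D(i,0) = D(0,j) = \infty \ \text{for } i, j \ge 1

\mathrm{DTW}(A,B) = D(|A|,\,|B|)
```

ED compares the two series index by index, whereas DTW lets one point align with several points of the other series, so a series and a time-shifted copy of it can still be close.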
Summary of the invention
To better compute the similarity between time series, the present invention provides a method for computing time series similarity that considers not only the distance between time series but also their shape features. The specific technical solution is as follows:
(1) Set the lengths of the m time series to be measured uniformly to n, where n is not less than the length of the longest of the m time series;
(2) Arrange the m time series of length n into a matrix $T_{m \times n}$;
(3) Reduce the dimension of $T_{m \times n}$ with the PCA dimensionality reduction algorithm to obtain a new matrix $T_{m \times l}$, where l is the time series length after reduction;
(4) Compute the DTW distance $\mathrm{Dist}_1$ between two time series (A and B) in $T_{m \times l}$;
(5) Compute the derivative of each time series in $T_{m \times l}$ to form derivative time series, then compute the DTW distance $\mathrm{Dist}_2$ between the derivative time series of the two series A and B from step 4; this is the DDTW distance of the time series;
(6) The resulting time series similarity is $\mathrm{Dist} = \alpha \cdot \mathrm{Dist}_1 + (1-\alpha) \cdot \mathrm{Dist}_2$, where $\alpha \in (0, 1)$;
(7) Perform clustering according to the similarity Dist and compute the cophenetic correlation coefficient between the clustering result and Dist; try different values of α and find the value α' that maximizes the cophenetic correlation coefficient;
(8) Using the α' obtained in step 7, obtain the final similarity of time series A and B: $\mathrm{Dist} = \alpha' \cdot \mathrm{Dist}_1 + (1-\alpha') \cdot \mathrm{Dist}_2$.
Further, in step 1, n is the length of the longest of the m time series to be measured.
Further, in step 1, any time series shorter than n is padded with zeros at its end so that its length is n.
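Steps 1 to 3 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names `pad_to_matrix` and `pca_reduce` are hypothetical, NumPy is assumed to be available, and PCA is computed here through a singular value decomposition of the column-centred matrix, which is one common way to do it. The patent does not say which PCA variant is used.

```python
import numpy as np

def pad_to_matrix(series_list, n=None):
    """Zero-pad every series at its end to length n and stack them as rows.

    By default n is the length of the longest series (steps 1 and 2).
    """
    longest = max(len(s) for s in series_list)
    if n is None:
        n = longest
    if n < longest:
        raise ValueError("n must not be less than the longest series length")
    T = np.zeros((len(series_list), n))
    for i, s in enumerate(series_list):
        T[i, :len(s)] = s  # the remaining entries stay 0
    return T

def pca_reduce(T, l):
    """Project the m x n matrix T onto its first l principal components (step 3)."""
    centered = T - T.mean(axis=0)           # centre each column (time index)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:l].T              # shape (m, l)

# Three toy series of different lengths, reduced to length 2
T = pad_to_matrix([[1, 2, 3], [4, 5], [6]])
R = pca_reduce(T, 2)
```

Each row of `R` is then treated as a shortened time series in the later steps. Note that `l` cannot exceed `min(m, n)` in this sketch.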
In computing time series similarity, the method of the present invention not only computes the distance between time series but also takes their shape features into account, which makes the similarity measurement more accurate.
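Steps 4 to 6 can be sketched in plain Python as below. Several details are assumptions, because the patent does not specify them: the local cost is the absolute difference, no warping window is applied, and the derivative uses the estimate commonly associated with derivative DTW (the average of the slope to the left neighbour and the slope across both neighbours). The function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def derivative(x):
    """Estimated derivative series; endpoints copy their nearest interior value."""
    if len(x) < 3:
        raise ValueError("need at least 3 points to estimate a derivative")
    inner = [((x[i] - x[i - 1]) + (x[i + 1] - x[i - 1]) / 2) / 2
             for i in range(1, len(x) - 1)]
    return [inner[0]] + inner + [inner[-1]]

def combined_distance(a, b, alpha):
    """Dist = alpha * Dist1 + (1 - alpha) * Dist2 with alpha in (0, 1)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    dist1 = dtw_distance(a, b)                          # DTW on the values
    dist2 = dtw_distance(derivative(a), derivative(b))  # DDTW on the slopes
    return alpha * dist1 + (1 - alpha) * dist2

# Two ramps with the same slope but a vertical offset of 1
d = combined_distance([0, 1, 2, 3], [1, 2, 3, 4], 0.5)
```

In the toy example the two ramps have identical shape, so the derivative term contributes nothing and only the value offset is penalised. This is the kind of distinction the weighted sum is meant to capture.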
Description of the drawings
Fig. 1 shows the cophenetic correlation coefficients computed by the different methods.
Detailed description
The time series similarity measurement method of the present invention is further explained below with reference to a specific embodiment.
The present invention provides a method for computing time series similarity that considers not only the distance between time series but also their shape features. The invention is described in detail below, taking time series of mobile phone communication records as an example:
1. Set the lengths of the 2076 time series to be measured uniformly to 4032, where 4032 is the length of the longest of the 2076 series. Series shorter than 4032 are padded with zeros at the end so that their length is 4032; that is, m = 2076 and n = 4032;
2. Arrange the 2076 time series of length 4032 into a matrix $T_{m \times n}$;
3. Reduce the dimension of $T_{m \times n}$ with the PCA dimensionality reduction algorithm to obtain a new matrix $T_{m \times l}$, where l is the time series length after reduction; here l = 8;
4. Compute the DTW distance $\mathrm{Dist}_1$ between any two time series in $T_{m \times l}$;
5. Compute the derivative of each time series in $T_{m \times l}$ to form derivative time series, then compute the DTW distance $\mathrm{Dist}_2$ between any two derivative time series; this is the DDTW distance of the time series;
6. The resulting time series similarity is $\mathrm{Dist} = \alpha \cdot \mathrm{Dist}_1 + (1-\alpha) \cdot \mathrm{Dist}_2$, where $\alpha \in (0, 1)$;
7. Perform clustering according to the similarity Dist and compute the cophenetic correlation coefficient between the clustering result and Dist; try different values of α and find the value α' that maximizes the cophenetic correlation coefficient;
8. Using the α' obtained in step 7, obtain the final time series similarity $\mathrm{Dist} = \alpha' \cdot \mathrm{Dist}_1 + (1-\alpha') \cdot \mathrm{Dist}_2$.
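Steps 7 and 8 can be sketched as a small grid search. The sketch assumes that the coefficient in step 7 is the cophenetic correlation coefficient of a hierarchical clustering and that the clustering uses average linkage. The patent names neither the clustering algorithm nor the candidate α values, so those choices, and the helper names `combine`, `cophenetic_correlation`, and `best_alpha`, are illustrative only. `D1` and `D2` stand for precomputed pairwise DTW and DDTW distance matrices.

```python
import math

def combine(D1, D2, alpha):
    """Element-wise alpha * D1 + (1 - alpha) * D2 for square distance matrices."""
    n = len(D1)
    return [[alpha * D1[i][j] + (1 - alpha) * D2[i][j] for j in range(n)]
            for i in range(n)]

def cophenetic_correlation(D):
    """Pearson correlation between the distances in D and the cophenetic
    distances of a naive average-linkage clustering of D."""
    n = len(D)
    clusters = {i: [i] for i in range(n)}
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        keys = list(clusters)
        best = None
        for p in range(len(keys)):
            for q in range(p + 1, len(keys)):
                a, b = clusters[keys[p]], clusters[keys[q]]
                d = sum(D[i][j] for i in a for j in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, keys[p], keys[q])
        height, ka, kb = best
        for i in clusters[ka]:        # members first joined at this height
            for j in clusters[kb]:
                coph[i][j] = coph[j][i] = height
        clusters[ka] = clusters[ka] + clusters.pop(kb)
    x = [D[i][j] for i in range(n) for j in range(i + 1, n)]
    y = [coph[i][j] for i in range(n) for j in range(i + 1, n)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((u - mx) * (v - my) for u, v in zip(x, y))
    vx = sum((u - mx) ** 2 for u in x)
    vy = sum((v - my) ** 2 for v in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(vx * vy)

def best_alpha(D1, D2, alphas):
    """Return (alpha', coefficient) for the candidate with the largest coefficient."""
    scored = [(cophenetic_correlation(combine(D1, D2, a)), a) for a in alphas]
    coeff, alpha = max(scored)
    return alpha, coeff
```

The clustering loop here is cubic in the number of series and is only meant to make the computation explicit. For data of the size used in this embodiment, a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` together with `cophenet` would be the more practical choice.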
As the figure shows, the cophenetic correlation coefficients obtained with the DTW and DDTW methods are higher than those obtained with traditional similarity measures (for example, Euclidean distance). Moreover, the cophenetic correlation coefficient obtained with the method of the present invention reaches its best value at a certain α. It follows that the method of the present invention reflects the similarity between two time series more accurately.

Claims (3)

1. A time series similarity measurement method, characterized by comprising the following steps:
(1) Set the lengths of the m time series to be measured uniformly to n, where n is not less than the length of the longest of the m time series;
(2) Arrange the m time series of length n into a matrix $T_{m \times n}$;
(3) Reduce the dimension of $T_{m \times n}$ with the PCA dimensionality reduction algorithm to obtain a new matrix $T_{m \times l}$, where l is the time series length after reduction;
(4) Compute the DTW distance $\mathrm{Dist}_1$ between two time series (A and B) in $T_{m \times l}$;
(5) Compute the derivative of each time series in $T_{m \times l}$ to form derivative time series, then compute the DTW distance $\mathrm{Dist}_2$ between the derivative time series of the two series A and B from step 4; this is the DDTW distance of the time series;
(6) The resulting time series similarity is $\mathrm{Dist} = \alpha \cdot \mathrm{Dist}_1 + (1-\alpha) \cdot \mathrm{Dist}_2$, where $\alpha \in (0, 1)$;
(7) Perform clustering according to the similarity Dist and compute the cophenetic correlation coefficient between the clustering result and Dist; try different values of α and find the value α' that maximizes the cophenetic correlation coefficient;
(8) Using the α' obtained in step 7, obtain the final similarity of time series A and B: $\mathrm{Dist} = \alpha' \cdot \mathrm{Dist}_1 + (1-\alpha') \cdot \mathrm{Dist}_2$.
2. The method according to claim 1, characterized in that, in step 1, n is the length of the longest of the m time series to be measured.
3. The method according to claim 1, characterized in that, in step 1, any time series shorter than n is padded with zeros at its end so that its length is n.
CN201810355812.6A 2018-04-19 2018-04-19 A time series similarity measurement method based on dynamic time warping Pending CN108536862A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810355812.6A CN108536862A (en) 2018-04-19 2018-04-19 A time series similarity measurement method based on dynamic time warping

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810355812.6A CN108536862A (en) 2018-04-19 2018-04-19 A time series similarity measurement method based on dynamic time warping

Publications (1)

Publication Number Publication Date
CN108536862A (en) 2018-09-14

Family

ID=63478644

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810355812.6A Pending CN108536862A (en) 2018-04-19 2018-04-19 A time series similarity measurement method based on dynamic time warping

Country Status (1)

Country Link
CN (1) CN108536862A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109596929A (en) * 2019-01-31 2019-04-09 国家电网有限公司 A voltage curve similarity judgment method considering the influence of clock asynchrony

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 2018-09-14)