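WO2021012315A1 - Time series abnormal pattern recognition method and device based on fuzzy matching - Google Patents

Time series abnormal pattern recognition method and device based on fuzzy matching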

Info

Publication number
WO2021012315A1
Authority
WO
WIPO (PCT)
Prior art keywords
time series
processing
matching
data
noise
Prior art date
Application number
PCT/CN2019/099467
Other languages
English (en)
French (fr)
Inventor
马晓彬
都志辉
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2021012315A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Definitions

  • the present invention relates to the technical field of data processing, in particular to a method and device for identifying abnormal patterns of time series based on fuzzy matching.
  • the present invention aims to solve one of the technical problems in the related art at least to a certain extent.
  • an object of the present invention is to propose a time series abnormal pattern recognition method based on fuzzy matching, which can achieve the purpose of performing sequential pattern search on the basis of appropriate scale and effectiveness.
  • Another object of the present invention is to provide a time series abnormal pattern recognition device based on fuzzy matching.
  • one embodiment of the present invention proposes a time series abnormal pattern recognition method based on fuzzy matching, which includes the following steps: performing missing value processing, white noise processing and outlier processing on the original time series to obtain a time series meeting a preset condition; detecting turning points in the time series meeting the preset condition to divide it into a plurality of segmented sub-sequences; and searching for sequence segments according to the start and end points of each segmented sub-sequence and matching them through the DTW (Dynamic Time Warping) algorithm to obtain the matching result.
  • DTW: Dynamic Time Warping
  • the time series abnormal pattern recognition method based on fuzzy matching in the embodiment of the present invention tolerates fuzzy matching of samples under variations in time domain, amplitude, offset, and noise. It does not need a precise definition of every abnormal pattern: a small number of templates is enough to search for the pattern fragments of interest.
  • by using the noise in the data, a signal-to-noise ratio is defined to drive the segmentation of the time series data and the corresponding acceleration optimization, so that sequence pattern search is carried out at an appropriate scale and with validity.
  • time series abnormal pattern recognition method based on fuzzy matching may also have the following additional technical features:
  • performing missing value processing, white noise processing, and outlier processing on the data includes: slicing the original time series at data interruptions according to a preset interval threshold to obtain multiple independent time series fragments; and eliminating time series fragments whose overall density is lower than the first preset threshold and performing linear interpolation on the remaining fragments to obtain the time series fragments after missing value processing.
  • performing missing value processing, white noise processing, and outlier processing on the data further includes: performing wavelet decomposition on the time series fragments after missing value processing to obtain the wavelet coefficients of each fragment; and thresholding the wavelet coefficients of each fragment, so as to perform noise reduction on fragments with wavelet coefficients less than the second preset threshold.
  • performing missing value processing, white noise processing, and outlier processing on the data further includes: determining the noise intensity according to the difference between the wavelet transform and the original data; and invalidating system noise whose noise intensity exceeds the random error threshold.
  • before the matching is performed by the DTW algorithm, the method further includes: performing Z-Score standardization on each segmented sub-sequence, so that the matching template can be matched to a signal of the target amplitude.
  • a time series abnormal pattern recognition device based on fuzzy matching, which includes: a processing module for performing missing value processing, white noise processing and outlier processing on the original time series to obtain a time series meeting preset conditions; a detection module for detecting turning points in the time series meeting the preset conditions, so as to divide it into a plurality of segmented sub-sequences; and a matching module for searching for sequence segments according to the start and end points of each segmented sub-sequence and matching them through the DTW algorithm to obtain the matching result.
  • the time series abnormal pattern recognition device based on fuzzy matching in the embodiment of the present invention tolerates fuzzy matching of samples under variations in time domain, amplitude, offset, and noise. It does not need a precise definition of every abnormal pattern: a small number of templates is enough to search for the pattern fragments of interest.
  • by using the noise in the data, a signal-to-noise ratio is defined to drive the segmentation of the time series data and the corresponding acceleration optimization, so that sequence pattern search is carried out at an appropriate scale and with validity.
  • time series abnormal pattern recognition device based on fuzzy matching may also have the following additional technical features:
  • the processing module is further configured to slice the original time series at data interruptions according to a preset interval threshold to obtain a plurality of independent time series fragments, eliminate time series fragments whose overall density is lower than the first preset threshold, and perform linear interpolation on the remaining fragments to obtain the time series fragments after missing value processing.
  • the processing module is further configured to perform wavelet decomposition on the time series fragments after missing value processing to obtain the wavelet coefficients of each fragment, and to threshold the wavelet coefficients of each fragment so as to perform noise reduction on fragments with wavelet coefficients less than the second preset threshold.
  • the processing module is further configured to determine the noise intensity according to the difference between the wavelet transform and the original data, and to invalidate system noise whose noise intensity exceeds the random error threshold.
  • the device further includes a standardization module, which performs Z-Score standardization on each segmented sub-sequence before matching through the DTW algorithm, so that the matching template can be matched to a signal of the target amplitude.
  • FIG. 1 is a flowchart of a method for identifying abnormal patterns of time series based on fuzzy matching according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of sequence segmentation and screening based on density and absolute deletion length according to an embodiment of the present invention
  • Fig. 3 is a schematic diagram of steps for noise reduction using wavelet transform according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of the definition of vertical PIP distance metric when adding new key points according to an embodiment of the present invention
  • FIG. 5 is a schematic diagram of a stellar flare discovered by a fuzzy matching algorithm according to an embodiment of the present invention.
  • Fig. 6 is a schematic structural diagram of a time series abnormal pattern recognition device based on fuzzy matching according to an embodiment of the present invention.
  • Before introducing the method and device of time series abnormal pattern recognition based on fuzzy matching, the definition of an abnormal pattern is given as follows:
  • in the embodiment of the present invention, an abnormal pattern can be defined and real abnormal samples can be obtained by searching the sequences, for in-depth research.
  • the definition of pattern abnormality is: for multiple continuous brightness segments S1, ..., Sn, if their change process in the given features is highly similar to the pattern of interest in the embodiment of the present invention, so that they can be regarded as the same type of astronomical phenomenon, then that run of consecutive magnitude values is determined to be a search target of the embodiment of the present invention.
  • the definition of similarity, the given template, and the corresponding similarity measurement method reflect the preference of the embodiment of the present invention for the target pattern.
  • Fig. 1 is a flowchart of a method for identifying an abnormal pattern of time series based on fuzzy matching according to an embodiment of the present invention.
  • the method for identifying time series abnormal patterns based on fuzzy matching includes the following steps:
  • step S101 missing value processing, white noise processing, and outlier processing are performed on the original time series to obtain a time series meeting preset conditions.
  • the time series that meets the preset conditions can be understood as continuous brightness points with no isolated outlier points and with a relatively low and stable random noise level.
  • the method of the embodiment of the present invention is an active search and matching for abnormal patterns, which mainly includes two parts: preprocessing of the data, and screening and similarity matching of candidate subsequences.
  • real data contains various kinds of missing data, outliers, and random noise: missing data makes the matching algorithm fail, outliers caused by instruments and abnormal factors make the matching fail, and random noise degrades the accuracy with which the embodiment of the present invention measures the similarity of data fragments.
  • Therefore, the embodiment of the present invention addresses these three problems by corresponding means, ensuring that the data entering the subsequent pruning and matching stage are continuous brightness points with no isolated outlier points and with a relatively low and stable random noise level.
  • performing missing value processing, white noise processing, and outlier processing on the data includes: performing data interruption slicing on the original time series according to a threshold of a preset interval to obtain multiple independent Time series fragments: remove time series fragments whose overall density is lower than the first preset threshold, and perform linear interpolation processing on the remaining time series fragments to obtain time series fragments after missing value processing.
  • the is_valid function is a function to judge whether the data point has a value, if there is a value, it is 1, otherwise it is 0.
  • the embodiment of the present invention does not save the data at all times, but only saves the effective data time stamp (the floating point value of Julian Day JD) sequence and the corresponding magnitude brightness value.
  • the Julian Day defines a day as 86400 seconds, corresponding to the value 1, so each second corresponds to $1/86400$ of the Julian Day value.
  • the sampling interval of each time series is 15 seconds.
  • by the Julian Day definition, the difference between two adjacent valid data points is therefore $15/86400 \approx 1.736 \times 10^{-4}$ days. With this interval recorded as a constant, the time stamps Ts(i...j) of a data sub-sequence can be calculated quickly (the formula appears only as an equation image in the original publication).
  • the missing value processing operation is divided into two steps. First, a large-interval threshold is set: the series is interrupted and sliced at every gap greater than 8 points (corresponding to 0.001389 days), and the resulting pieces are treated as completely independent time series fragments, because continuous missing data longer than this cannot be completed by interpolation. Removing large missing stretches (across different days, or long gaps within the same day) greatly reduces the proportion of null values in the subsequent tree construction and reduces the complexity of data processing.
  • performing missing value processing, white noise processing, and outlier processing on the data further includes: performing wavelet decomposition on the time series segment after the missing value processing to obtain the wavelet of each segment Coefficient; threshold processing is performed on the wavelet coefficient of each segment, so as to perform noise reduction processing on segments with wavelet coefficients less than the second preset threshold.
  • the embodiment of the present invention can use the wavelet transform to decompose the data into representations at different frequencies and threshold the coefficients at each frequency, so that wavelet-domain coefficients of smaller amplitude are processed in a targeted manner to achieve noise reduction.
  • the denoising operation by wavelet transform can be divided into four steps.
  • the embodiment of the present invention uses Symlets6 as the wavelet basis function to perform a discrete dyadic wavelet transform.
  • to speed up the transform, the signal is decomposed to two levels, and all low-frequency signals are kept.
  • the Garrote threshold method is used for filtering and scaling to reduce noise.
  • the filtering threshold $\lambda$ is calculated using the universal threshold method VisuShrink. Specifically:
  • the noise level $\sigma$ of the wavelet frequency coefficient sequence, that is, the standard deviation of the wavelet-domain coefficients, must be estimated for use in the VisuShrink calculation.
  • the formula is as follows: $\sigma = \mathrm{median}(|\omega_{J-1,k}|) / 0.6745$
  • J is the number of dyadic wavelet transform levels; that is, $\sigma$ is the median of the absolute values of all wavelet coefficients at scale J-1 after the wavelet transform, divided by 0.6745, where 0.6745 is a predefined hyperparameter in the VisuShrink algorithm.
  • the filtering threshold $\lambda$ can be defined as follows: $\lambda = \sigma \sqrt{2 \ln N}$
  • N is the length of the signal sequence data.
  • the embodiment of the present invention processes wavelet coefficients of different sizes against this threshold: $\hat{\omega}_{j,k} = \omega_{j,k} - \lambda^2 / \omega_{j,k}$ if $|\omega_{j,k}| > \lambda$, and $\hat{\omega}_{j,k} = 0$ otherwise.
  • $\omega_{j,k}$ are the original wavelet coefficients, before filtering, at time k on scale j; $\hat{\omega}_{j,k}$ denotes the filtered wavelet coefficients under the corresponding conditions.
  • performing missing value processing, white noise processing, and outlier processing on the data further includes: determining the noise intensity according to the difference value between the wavelet transform and the original data; and determining the noise intensity exceeding the random error The threshold system noise is invalidated.
  • the embodiment of the present invention uses the difference between the wavelet transform and the original data to measure the noise intensity. Because noise is present, the median can be used to estimate the noise distribution. Assuming the noise follows a normal distribution with mean $\mu$ and standard deviation $\sigma$, the noise intensity is defined as twice the half-width of the interval that contains 95% of the distribution, that is, 1.96*2σ, and data outside the range (μ-3σ, μ+3σ) are filtered as outliers. From the normal distribution table, about 0.5 of the data lies within (μ-0.68σ, μ+0.68σ), which gives an estimate σ* of the standard error of the data (the formula appears only as an equation image in the original publication).
  • step S102 a turning point in the time series meeting the preset condition is detected, so as to divide the time series meeting the preset condition into a plurality of segmented sub-sequences.
  • PIP: perceptually important points (key points)
  • the distance measurement method used in the embodiment of the present invention is a vertical distance measurement method.
  • the embodiment of the present invention compares the distance at which each key point is generated with the noise intensity to decide when to stop, ensuring that the segmentation compresses the data as far as possible while retaining the complete change characteristics of the data. Segmentation stops when $d < 2 \times 1.96\,\sigma^*$.
  • step S103 the obtained sequence segment is searched according to the start and end points of each segmented subsequence, and the DTW algorithm is used for matching to obtain a matching result.
  • the DTW algorithm before matching is performed by the DTW algorithm, it further includes: performing Z-Score standardization on each segmented subsequence, so that the matching template is matched to the signal of the target amplitude.
  • the embodiment of the present invention needs to perform Z-Score standardization on the subsequence so that the template can be matched to signals of different amplitudes, and at the same time uses a threshold based on signal amplitude and the noise intensity $2 \times 1.96\,\sigma^*$ to ignore every subsequence whose overall signal change is too small.
  • the embodiment of the present invention uses the DTW algorithm for matching.
  • the DTW algorithm is still considered to be the best time series similarity calculation method, which solves the problem of time domain offset.
  • those skilled in the art can also select other algorithms for matching according to actual conditions, which are only used as examples here and are not specifically limited.
  • Fig. 5 is an example of a successful matching.
  • the offline algorithm of the embodiment of the present invention finds a real stellar flare that the online algorithm did not find. As the figure shows, the total length of the sequence is only 89 points and the data before it is invalid, so such an abnormal light variation cannot be detected by accumulating valid values.
  • This is something a window-based online statistical warning method cannot do, and it is the advantage of the active pattern-matching anomaly algorithm in the embodiment of the present invention.
  • After noise reduction, the calculated noise intensity of this segment of data is 0.0605; it is a sequence with good signal continuity, a high signal-to-noise ratio, and a low noise level. The 89 points are therefore divided into 5 sequence segments, so that the data within each segment is relatively stable.
  • the embodiment of the present invention does not need to accurately find the minimum value of the best match.
  • segmentation makes it possible to find the fragment whose similarity to the query template is most characteristic; candidates are then filtered by a sequence length threshold and an amplitude threshold, and the search scope is narrowed by the necessary conditions for sequence matching. In the end, only 8 of the 3916 possible subsequence combinations are calculated, and a good match to the template is obtained, namely the subsequence composed of segments 2 to 4.
  • the three sub-pictures in Fig. 5 are sub-picture one, sub-picture two, and sub-picture three in order from top to bottom.
  • sub-picture one is the sequence after noise reduction, and the solid line indicates the matching result;
  • sub-figure two shows the offset correspondence that DTW finds between the template and the queried sequence, with corresponding matched data points connected by black dashed lines; sub-figure three shows that the DTW algorithm copes well with the time-domain offset problem.
  • the first half of the template is matched to the sharp rise in brightness, and the second half is adaptively shifted.
  • the real data set used for testing in the embodiment of the present invention is several days of data from a GWAC telescope covering different sky regions, including 1,006,078 stars and approximately 269 million data points. After missing value processing, fragments are additionally required to contain more than 20 points. After this filtering, a total of 253 million data points remained and entered data validity verification, noise reduction, and the subsequent analysis and judgment. Since the curve of each star may be divided into several fragments, the 253 million data points form 1,084,269 light curves that meet the density threshold, with an average length of 233.04, and they include two stellar flare events. In the full test, the serial version took 3336.68 s, and the algorithm found both flares.
  • the method for identifying abnormal patterns of time series based on fuzzy matching proposed in the embodiments of the present invention tolerates fuzzy matching of samples under variations in time domain, amplitude, offset, and noise. It does not need a precise definition of every abnormal pattern: a small number of templates is enough to search for the pattern fragments of interest.
  • by using the noise in the data, a signal-to-noise ratio is defined to drive the segmentation of the time series data and the corresponding acceleration optimization, so that sequence pattern search is carried out at an appropriate scale and with validity.
  • Fig. 6 is a schematic structural diagram of a time series abnormal pattern recognition device based on fuzzy matching according to an embodiment of the present invention.
  • the time series abnormal pattern recognition device 10 based on fuzzy matching includes: a processing module 100, a detection module 200 and a matching module 300.
  • the processing module 100 is used to perform missing value processing, white noise processing, and outlier processing on the original time series to obtain a time series meeting preset conditions.
  • the detection module 200 is configured to detect turning points in the time series that meet the preset conditions, so as to divide the time series that meet the preset conditions into multiple segmented sub-sequences.
  • the matching module 300 is used to search for the sequence segments obtained according to the start and end points of each segmented subsequence, and perform matching through the DTW algorithm to obtain a matching result.
  • the device 10 of the embodiment of the present invention does not need a precise definition of every abnormal pattern, can find the pattern fragments of interest with only a few templates, and achieves sequence pattern search at an appropriate scale and with validity.
  • the processing module 100 is further configured to slice the original time series at data interruptions according to the preset interval threshold to obtain a plurality of independent time series fragments, eliminate time series fragments whose overall density is lower than the first preset threshold, and perform linear interpolation on the remaining fragments to obtain the time series fragments after missing value processing.
  • the processing module 100 is further configured to perform wavelet decomposition on the time series segment after the missing value processing to obtain the wavelet coefficients of each segment, and perform threshold processing on the wavelet coefficients of each segment , To perform noise reduction processing on the segments whose wavelet coefficients are less than the second preset threshold.
  • the processing module 100 is further configured to determine the noise intensity according to the difference value between the wavelet transform and the original data, and perform invalid processing on the system noise whose noise intensity exceeds the random error threshold.
  • the device 10 of the embodiment of the present invention further includes: a standardization module.
  • the standardization module is used to perform Z-Score standardization on each segmented subsequence before matching by the DTW algorithm, so that the matching template is matched to the signal of the target amplitude.
  • the time series abnormal pattern recognition device based on fuzzy matching proposed according to the embodiment of the present invention tolerates fuzzy matching of samples under variations in time domain, amplitude, offset, and noise. It does not need a precise definition of every abnormal pattern: a small number of templates is enough to search for the pattern fragments of interest.
  • by using the noise in the data, a signal-to-noise ratio is defined to drive the segmentation of the time series data and the corresponding acceleration optimization, so that sequence pattern search is carried out at an appropriate scale and with validity.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Therefore, features defined with "first" and "second" may explicitly or implicitly include at least one of those features. In the description of the present invention, "a plurality of" means at least two, such as two, three, etc., unless otherwise specifically defined.
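The matching stage described in the bullets above (Z-Score standardization of each candidate subsequence, then DTW against a template) can be illustrated with a minimal sketch. The function names `z_score` and `dtw_distance`, the absolute-difference point cost, and the unconstrained warping path are assumptions made for illustration; the publication does not state which DTW variant or cost function it uses.

```python
import math
from typing import List, Sequence


def z_score(seq: Sequence[float]) -> List[float]:
    """Standardize a sequence to zero mean and unit (population) standard deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / n)
    if std == 0.0:
        # A constant sequence carries no shape information.
        return [0.0] * n
    return [(v - mean) / std for v in seq]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic-programming DTW with |x - y| as the point cost (assumed)."""
    inf = math.inf
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix matched against empty prefix
    for i in range(1, len(a) + 1):
        cur = [inf] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or diagonal match.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[len(b)]
```

Because both sequences are standardized first, a template and a candidate that differ only in amplitude and offset get a distance near zero, while the warping step absorbs stretching along the time axis.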

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Automation & Control Theory (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

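A time series abnormal pattern recognition method and device based on fuzzy matching. The method includes the following steps: performing missing value processing, white noise processing, and outlier processing on an original time series to obtain a time series that meets preset conditions (S101); detecting turning points in the time series that meets the preset conditions, so as to divide it into a plurality of segmented sub-sequences (S102); and searching for sequence segments according to the start and end points of each segmented sub-sequence and matching them by the DTW algorithm to obtain a matching result (S103). The method does not require a precise definition of every abnormal pattern, can find the pattern fragments of interest with only a small number of templates, and achieves sequence pattern search at an appropriate scale and with validity.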

Description

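Time series abnormal pattern recognition method and device based on fuzzy matching
Cross-reference to related applications
This application claims priority to Chinese patent application No. "201910673710.3", entitled "Time series abnormal pattern recognition method and device based on fuzzy matching", filed by Tsinghua University (清华大学) on July 24, 2019.
Technical field
The present invention relates to the technical field of data processing, and in particular to a time series abnormal pattern recognition method and device based on fuzzy matching.
Background
In time-domain astronomy, large volumes of stellar magnitude brightness time series must be processed. Many of the brightness anomalies in these light curves represent astronomical phenomena of research value, for example microlensing events and stellar flares, both of whose light variations have fairly accurate mathematical model descriptions. However, the related art is still unable to identify many of these brightness anomalies effectively.
Summary of the invention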
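The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
To this end, one object of the present invention is to propose a time series abnormal pattern recognition method based on fuzzy matching, which achieves sequence pattern search at an appropriate scale and with validity.
Another object of the present invention is to propose a time series abnormal pattern recognition device based on fuzzy matching.
To achieve the above objects, an embodiment of one aspect of the present invention proposes a time series abnormal pattern recognition method based on fuzzy matching, including the following steps: performing missing value processing, white noise processing, and outlier processing on an original time series to obtain a time series that meets preset conditions; detecting turning points in the time series that meets the preset conditions, so as to divide it into a plurality of segmented sub-sequences; and searching for sequence segments according to the start and end points of each segmented sub-sequence and matching them by the DTW (Dynamic Time Warping) algorithm to obtain a matching result.
The time series abnormal pattern recognition method based on fuzzy matching of the embodiment of the present invention tolerates fuzzy matching of samples under variations in time domain, amplitude, offset, and noise. It does not require a precise definition of every abnormal pattern, and the pattern fragments of interest can be found with only a small number of templates. By using the noise in the data, a signal-to-noise ratio is defined to drive the segmentation of the time series data and the corresponding acceleration optimization, so that sequence pattern search is carried out at an appropriate scale and with validity.
In addition, the time series abnormal pattern recognition method based on fuzzy matching according to the above embodiment of the present invention may also have the following additional technical features:
Further, in one embodiment of the present invention, performing missing value processing, white noise processing, and outlier processing on the data includes: slicing the original time series at data interruptions according to a preset interval threshold to obtain a plurality of independent time series fragments; and eliminating time series fragments whose overall density is lower than a first preset threshold and performing linear interpolation on the remaining fragments to obtain the time series fragments after missing value processing.
Further, in one embodiment of the present invention, performing missing value processing, white noise processing, and outlier processing on the data further includes: performing wavelet decomposition on the time series fragments after missing value processing to obtain the wavelet coefficients of each fragment; and thresholding the wavelet coefficients of each fragment, so as to perform noise reduction on fragments with wavelet coefficients less than a second preset threshold.
Further, in one embodiment of the present invention, performing missing value processing, white noise processing, and outlier processing on the data further includes: determining the noise intensity according to the difference between the wavelet transform and the original data; and invalidating system noise whose noise intensity exceeds the random error threshold.
Further, in one embodiment of the present invention, before matching by the DTW algorithm, the method further includes: performing Z-Score standardization on each segmented sub-sequence, so that the matching template can be matched to a signal of the target amplitude.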
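To achieve the above objects, an embodiment of another aspect of the present invention proposes a time series abnormal pattern recognition device based on fuzzy matching, including: a processing module, configured to perform missing value processing, white noise processing, and outlier processing on an original time series to obtain a time series that meets preset conditions; a detection module, configured to detect turning points in the time series that meets the preset conditions, so as to divide it into a plurality of segmented sub-sequences; and a matching module, configured to search for sequence segments according to the start and end points of each segmented sub-sequence and match them by the DTW algorithm to obtain a matching result.
The time series abnormal pattern recognition device based on fuzzy matching of the embodiment of the present invention tolerates fuzzy matching of samples under variations in time domain, amplitude, offset, and noise. It does not require a precise definition of every abnormal pattern, and the pattern fragments of interest can be found with only a small number of templates. By using the noise in the data, a signal-to-noise ratio is defined to drive the segmentation of the time series data and the corresponding acceleration optimization, so that sequence pattern search is carried out at an appropriate scale and with validity.
In addition, the time series abnormal pattern recognition device based on fuzzy matching according to the above embodiment of the present invention may also have the following additional technical features:
Further, in one embodiment of the present invention, the processing module is further configured to slice the original time series at data interruptions according to a preset interval threshold to obtain a plurality of independent time series fragments, eliminate time series fragments whose overall density is lower than the first preset threshold, and perform linear interpolation on the remaining fragments to obtain the time series fragments after missing value processing.
Further, in one embodiment of the present invention, the processing module is further configured to perform wavelet decomposition on the time series fragments after missing value processing to obtain the wavelet coefficients of each fragment, and to threshold the wavelet coefficients of each fragment, so as to perform noise reduction on fragments with wavelet coefficients less than the second preset threshold.
Further, in one embodiment of the present invention, the processing module is further configured to determine the noise intensity according to the difference between the wavelet transform and the original data, and to invalidate system noise whose noise intensity exceeds the random error threshold.
Further, in one embodiment of the present invention, the device further includes: a standardization module, configured to perform Z-Score standardization on each segmented sub-sequence before matching by the DTW algorithm, so that the matching template can be matched to a signal of the target amplitude.
Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present invention.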
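Description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
FIG. 1 is a flowchart of a time series abnormal pattern recognition method based on fuzzy matching according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of sequence slicing and screening based on density and absolute missing length according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the steps of noise reduction using the wavelet transform according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the definition of the vertical PIP distance metric when a new key point is added according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a stellar flare discovered by the fuzzy matching algorithm according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a time series abnormal pattern recognition device based on fuzzy matching according to an embodiment of the present invention.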
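Detailed description
The embodiments of the present invention are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary and are intended to explain the present invention; they should not be construed as limiting it.
Before introducing the time series abnormal pattern recognition method and device based on fuzzy matching, the definition of pattern abnormality is given, as follows:
The embodiment of the present invention can define an abnormal pattern and search the sequences for real abnormal samples, for in-depth research. Pattern abnormality is defined as follows: for multiple continuous brightness segments S1, …, Sn, if their change process in the given features is highly similar to the pattern of interest in the embodiment of the present invention, so that they can be regarded as the same type of astronomical phenomenon, then that run of consecutive magnitude values is determined to be a search target of the embodiment of the present invention. The definition of similarity, the given template, and the corresponding similarity measurement method reflect the preference of the embodiment of the present invention for the target pattern.
The time series abnormal pattern recognition method and device based on fuzzy matching proposed according to the embodiments of the present invention are described below with reference to the drawings. The method is described first.
FIG. 1 is a flowchart of a time series abnormal pattern recognition method based on fuzzy matching according to one embodiment of the present invention.
As shown in FIG. 1, the time series abnormal pattern recognition method based on fuzzy matching includes the following steps:
In step S101, missing value processing, white noise processing, and outlier processing are performed on the original time series to obtain a time series that meets preset conditions.
Here, the time series that meets the preset conditions can be understood as continuous brightness points with no isolated outlier points and with a relatively low and stable random noise level.
It should be noted that the method of the embodiment of the present invention is an active search and matching for pattern abnormalities, which mainly includes two parts: preprocessing of the data, and screening and similarity matching of candidate subsequences. Real data contains various kinds of missing data, outliers, and random noise: missing data makes the matching algorithm fail, outliers caused by instruments and abnormal factors make the matching fail, and random noise degrades the accuracy with which the embodiment of the present invention measures the similarity of data fragments. Therefore, the embodiment of the present invention addresses these three problems by corresponding means, ensuring that the data entering the subsequent pruning and matching stage are continuous brightness points with no isolated outlier points and with a relatively low and stable random noise level.
Further, in one embodiment of the present invention, performing missing value processing, white noise processing, and outlier processing on the data includes: slicing the original time series at data interruptions according to a preset interval threshold to obtain a plurality of independent time series fragments; and eliminating time series fragments whose overall density is lower than the first preset threshold and performing linear interpolation on the remaining fragments to obtain the time series fragments after missing value processing.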
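Specifically, in missing value processing, the existence density of an original time series y of length n is defined as follows:
$$\mathrm{density}(y) = \frac{1}{n} \sum_{i=1}^{n} \mathrm{is\_valid}(y_i)$$
where the is_valid function judges whether the data point has a value: it is 1 if there is a value and 0 otherwise.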
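As a minimal sketch of this density definition: the code below assumes that a missing point is represented either by `None` or by a floating-point NaN, which is a representation chosen here for illustration and is not specified in the publication.

```python
import math
from typing import Optional, Sequence


def is_valid(value: Optional[float]) -> int:
    """Return 1 if the data point has a value, otherwise 0."""
    if value is None or math.isnan(value):
        return 0
    return 1


def density(y: Sequence[Optional[float]]) -> float:
    """Existence density: the fraction of points in y that carry a value."""
    n = len(y)
    if n == 0:
        return 0.0  # assumed convention for an empty series
    return sum(is_valid(v) for v in y) / n
```

For a four-point series in which two points are missing, `density` returns 0.5.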
在实际的GWAC光变曲线序列数据集中,本发明实施例没有保存所有时刻的数据,而仅仅保存了有效的数据时间戳(儒略日JD的浮点数值)序列和对应的星等亮度值。其中,儒略日定义一天为86400秒,对应数值1,即每一秒对应儒略日数值的
Figure PCTCN2019099467-appb-000002
每一个时间序列的采样间隔为15秒,根据儒略日定义,两个有效的数据点之间的儒略日差值为
Figure PCTCN2019099467-appb-000003
将其记为
Figure PCTCN2019099467-appb-000004
则某一段数据子序列时间戳Ts(i...j)可以通过以下公式快速计算:
Figure PCTCN2019099467-appb-000005
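Further, as shown in Fig. 2, missing-value processing has two steps. First, using a large-gap threshold, the series is cut at every data interruption longer than 8 points (corresponding to 0.001389 days), and the resulting pieces are treated as fully independent time-series fragments, because runs of missing data longer than this cannot be repaired by interpolation for later processing. Removing large gaps (gaps between different days or long gaps within the same day) greatly reduces the proportion of empty values in the subsequent tree construction and lowers the complexity of data processing.
Second, the missing values inside each fragment are judged by the overall density of the fragment. Fragments whose overall density is below a threshold are discarded and fragments above the threshold are kept; fragments that reach the required density are linearly interpolated in the subsequent steps. The resulting data are fragments whose absolute length meets the requirement, that have no long continuous gaps, and whose overall proportion of missing data also meets the requirement.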
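The two-step missing-value handling described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiments: the function names, the conversion of timestamps to integer sampling slots, and the default `min_density` value are assumptions introduced here (the source does not state the density threshold). Only the 15-second sampling interval and the 8-point gap limit come from the text.

```python
SAMPLE_DT_DAYS = 15.0 / 86400.0                  # 15 s sampling interval in Julian days
MAX_GAP_POINTS = 8                               # longer interruptions split the series
MAX_GAP_DAYS = MAX_GAP_POINTS * SAMPLE_DT_DAYS   # about 0.001389 days


def interpolate(slots, values):
    """Fill every missing sampling slot between known points by linear interpolation."""
    filled = []
    for i in range(len(slots) - 1):
        s0, s1 = slots[i], slots[i + 1]
        v0, v1 = values[i], values[i + 1]
        for k in range(s0, s1):
            filled.append(v0 + (v1 - v0) * (k - s0) / (s1 - s0))
    filled.append(values[-1])
    return filled


def preprocess(times, values, min_density=0.75):
    """Split at large gaps, drop sparse fragments, interpolate the rest.

    min_density is a hypothetical threshold chosen only for illustration.
    """
    if not times:
        return []
    # Map each Julian-date timestamp to an integer sampling slot.
    slots = [round((t - times[0]) / SAMPLE_DT_DAYS) for t in times]

    # Step 1: cut wherever consecutive valid points are more than 8 slots apart.
    pieces, start = [], 0
    for i in range(1, len(slots)):
        if slots[i] - slots[i - 1] > MAX_GAP_POINTS:
            pieces.append((slots[start:i], values[start:i]))
            start = i
    pieces.append((slots[start:], values[start:]))

    # Step 2: keep fragments that are dense enough, then interpolate them.
    result = []
    for frag_slots, frag_values in pieces:
        expected = frag_slots[-1] - frag_slots[0] + 1
        if len(frag_slots) / expected >= min_density:
            result.append(interpolate(frag_slots, frag_values))
    return result
```

Working in integer slots rather than raw floating-point timestamps keeps the gap and density tests exact; a real pipeline would also apply the minimum-length filter mentioned later in the description.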
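Further, in one embodiment of the present invention, the missing-value, white-noise, and outlier processing of the data further includes: performing wavelet decomposition on the time-series fragments after missing-value processing to obtain the wavelet coefficients of each fragment; and thresholding the wavelet coefficients of each fragment so as to denoise the fragments whose wavelet coefficients are smaller than a second preset threshold.
Specifically, for white-noise processing: in a real time series the valid signal is continuous in the time domain, whereas white noise is not. When the time-series signal is decomposed by the wavelet transform, the amplitude of the normal signal in the wavelet domain is larger than that of the noise. The embodiments therefore use the wavelet transform to decompose the data into representations at different frequencies and threshold the coefficients at those frequencies, so that the small-amplitude wavelet-domain coefficients are handled selectively and the data are denoised.
Further, as shown in Fig. 3, denoising by wavelet transform can be divided into four steps. The embodiments use Symlets6 as the wavelet basis function and perform a discrete dyadic wavelet transform. To speed up the transform, the signal may be decomposed to two levels. All low-frequency components are retained, while all high-frequency components are filtered and shrunk with the Garrote threshold method to remove noise. The filter threshold λ is computed with the universal threshold method VisuShrink. Specifically:
First, the noise level $\hat{\sigma}$ of the wavelet coefficient sequence, that is, the standard deviation of the wavelet-domain coefficients, must be estimated for use in the VisuShrink computation. It is given by:
$$\hat{\sigma} = \frac{\operatorname{median}_k\left(\left|\omega_{J-1,k}\right|\right)}{0.6745}$$
where J is the number of dyadic wavelet transform levels. In other words, $\hat{\sigma}$ is the median of the absolute values of all wavelet coefficients at scale J-1, divided by 0.6745. The constant 0.6745 is predefined in the VisuShrink algorithm; it is the 0.75 quantile of the standard normal distribution, which converts a median absolute value into a standard-deviation estimate.
The filter threshold λ can then be defined as:
$$\lambda = \hat{\sigma}\sqrt{2\ln N}$$
where N is the length of the signal sequence.
Then, using this threshold, the embodiments process wavelet coefficients of different magnitudes as follows:
$$\hat{\omega}_{j,k} = \begin{cases} \omega_{j,k} - \dfrac{\lambda^{2}}{\omega_{j,k}}, & \left|\omega_{j,k}\right| > \lambda \\ 0, & \left|\omega_{j,k}\right| \le \lambda \end{cases}$$
where $\omega_{j,k}$ is the original, unfiltered wavelet coefficient at scale j and time k, and $\hat{\omega}_{j,k}$ is the filtered wavelet coefficient under the corresponding condition.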
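The thresholding stage can be illustrated with a short standard-library sketch. It assumes the detail (high-frequency) coefficients have already been produced by a wavelet library, since the Symlets6 decomposition itself is outside the scope of this example. The function names are hypothetical; the formulas are the standard median-based noise estimate, the VisuShrink universal threshold, and non-negative Garrote shrinkage.

```python
import math
import statistics


def noise_sigma(detail_coeffs):
    """Estimate the noise level as median(|w|) / 0.6745 over the detail coefficients."""
    return statistics.median(abs(w) for w in detail_coeffs) / 0.6745


def universal_threshold(sigma, n):
    """VisuShrink universal threshold: sigma * sqrt(2 ln N) for a signal of length N."""
    return sigma * math.sqrt(2.0 * math.log(n))


def garrote(w, lam):
    """Garrote shrinkage: zero small coefficients, shrink large ones by lam^2 / w."""
    if abs(w) <= lam:
        return 0.0
    return w - lam * lam / w


def shrink_details(detail_coeffs, signal_length):
    """Apply the three steps above to one band of detail coefficients."""
    lam = universal_threshold(noise_sigma(detail_coeffs), signal_length)
    return [garrote(w, lam) for w in detail_coeffs]
```

Compared with hard thresholding, Garrote shrinkage is continuous at the threshold, and compared with soft thresholding it biases large coefficients less, which is a plausible reason for choosing it when sharp brightness changes must be preserved.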
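Further, in one embodiment of the present invention, the missing-value, white-noise, and outlier processing of the data further includes: determining the noise strength from the difference between the wavelet-transformed data and the original data; and invalidating systematic noise whose noise strength exceeds the random-error threshold.
Specifically, for outlier processing and data validity checking: real light-curve data contain a great deal of invalid data caused by natural accidents, for example changes in cloud cover, moonlight, or other uncontrollable external natural factors. Systematic noise that exceeds the random-error threshold falls into two classes: (1) the random noise of the light curve is too strong, so the time series shows high-frequency fluctuations in mean and variance; (2) a small number of points are outliers, that is, their values deviate too far from the mean of that stretch of data.
The two classes call for different treatments. The embodiments measure the noise strength by the difference between the wavelet-transformed data and the original data. Because noise is present, the median can be used to estimate the noise distribution. Assuming the noise follows a normal distribution with mean μ and standard deviation σ, the noise strength is defined as twice the half-width of the interval that covers 95% of the normal distribution, that is, 2 × 1.96σ, and data outside the range (μ-3σ, μ+3σ) are filtered out as outliers. From the normal distribution table, about half of the data lie within (μ-0.68σ, μ+0.68σ), so the median of the absolute deviations corresponds to about 0.68σ. This gives the following estimate σ* of the standard error of the data, where $d_i$ is the difference between the original value and the wavelet-denoised value at point i:
$$\sigma^{*} = \frac{\operatorname{median}_i\left(\left|d_i - \mu\right|\right)}{0.68}$$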
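As a rough illustration of the noise-strength estimate and the 3σ outlier rule, the sketch below takes the residuals (original value minus wavelet-denoised value) as input. The function names and the use of the median of the residuals as the centre μ are assumptions made for this example; the constants 0.68, 1.96 and 3 are those quoted in the text.

```python
import statistics


def robust_sigma(residuals):
    """Estimate sigma* as the median absolute deviation of the residuals divided by 0.68."""
    center = statistics.median(residuals)
    return statistics.median(abs(r - center) for r in residuals) / 0.68


def noise_strength(residuals):
    """Noise strength as defined in the text: 2 * 1.96 * sigma*."""
    return 2 * 1.96 * robust_sigma(residuals)


def outlier_mask(residuals):
    """Flag points whose residual lies outside (mu - 3 sigma*, mu + 3 sigma*)."""
    center = statistics.median(residuals)
    sigma = robust_sigma(residuals)
    return [abs(r - center) > 3 * sigma for r in residuals]
```

A median-based estimate is used because a few extreme points would inflate an ordinary standard deviation and hide themselves from the 3σ test.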
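In step S102, turning points in the time series that satisfies the preset conditions are detected, so as to divide that time series into multiple segmented subsequences.
Specifically, as shown in Fig. 4, data segmentation detects the turning points in the time series in order to divide the series into multiple segments whose variation is relatively stable and continuous and whose features are relatively distinct. Key-point detection is a fairly mature technique. Perceptually Important Points (PIP) is a top-down time-series segmentation algorithm that uses a greedy strategy. It first treats the whole series as one segment. It then iteratively searches all segments for the point that is farthest from the line through the two endpoints of its segment, sets that point as a new key point, and splits the current segment in two to update the set of segments. The distance measure used in the embodiments is the vertical distance.
Further, the embodiments decide when to stop by comparing the distance at which each key point is generated against the noise strength, which ensures that segmentation compresses the data as much as possible while preserving the complete variation features of the data. Segmentation stops when d ≤ 2 × 1.96σ*.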
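The greedy PIP segmentation with a noise-based stopping rule can be sketched as below. It is an illustrative sketch under simplifying assumptions: points are taken to be evenly spaced so that the index serves as the time axis, and `stop_distance` stands for the noise strength 2 × 1.96σ* computed elsewhere. The function names are hypothetical.

```python
def vertical_distance(ys, a, b, i):
    """Vertical distance from point i to the straight line joining points a and b."""
    y_on_line = ys[a] + (ys[b] - ys[a]) * (i - a) / (b - a)
    return abs(ys[i] - y_on_line)


def pip_segment(ys, stop_distance):
    """Return sorted key-point indices found by top-down PIP segmentation.

    Starting from the two endpoints, the point with the largest vertical
    distance over all current segments is added as a new key point until
    that largest distance is no greater than stop_distance.
    """
    if len(ys) < 3:
        return list(range(len(ys)))
    keys = [0, len(ys) - 1]
    while True:
        best_d, best_i = 0.0, None
        for a, b in zip(keys, keys[1:]):
            for i in range(a + 1, b):
                d = vertical_distance(ys, a, b, i)
                if d > best_d:
                    best_d, best_i = d, i
        if best_i is None or best_d <= stop_distance:
            return keys
        keys.append(best_i)
        keys.sort()
```

For example, a flat series with a single spike yields key points immediately before, at, and after the spike, so the spike is isolated as its own short segments while the flat stretches stay uncut.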
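In step S103, the sequence segments are located from the start and end points of each segmented subsequence and are matched with the DTW algorithm to obtain the matching result.
Further, in one embodiment of the present invention, before matching with the DTW algorithm, the method further includes: applying Z-Score standardization to each segmented subsequence, so that the matching template can match signals of the target amplitude.
Specifically, for fuzzy matching of subsequences, every abrupt signal in the segmented subsequences can be assumed to start or end at a segment boundary. The start and end points of the search can therefore be restricted to the sequence segments found through the key points, which greatly speeds up matching. The embodiments also apply Z-Score standardization to the subsequences so that the template can match signals of different amplitudes, and at the same time use a threshold based on the signal amplitude and the noise strength 2 × 1.96σ* to ignore every subsequence whose overall signal change is too small. For the matching itself, the embodiments use the DTW algorithm, which is still widely regarded as one of the best time-series similarity measures and which handles offsets in the time domain. Those skilled in the art may of course choose other matching algorithms according to the actual situation; DTW is given here only as an example and is not a specific limitation.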
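The matching step can be sketched with a textbook dynamic-programming DTW applied to Z-Score standardized sequences. This is a minimal sketch, not the optimized matcher of the embodiments: the function names are hypothetical, the absolute difference is assumed as the local cost, and the amplitude filter is interpreted here as a peak-to-peak test, which the source does not spell out.

```python
import math
import statistics


def z_score(xs):
    """Standardize a sequence to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    if std == 0:
        return [0.0] * len(xs)
    return [(x - mean) / std for x in xs]


def is_significant(xs, sigma_star):
    """Assumed amplitude filter: peak-to-peak change must exceed 2 * 1.96 * sigma*."""
    return max(xs) - min(xs) > 2 * 1.96 * sigma_star


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance with |x - y| cost."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats
                                 cost[i][j - 1],      # b[j-1] repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

With this sketch, a template `[0, 0, 1, 0, 0]` and a candidate `[0, 0, 0, 10, 0]` have a DTW distance of essentially zero after standardization, even though the peak is ten times larger and one step later, which is the tolerance to amplitude and time offset that the text describes.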
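The method for recognizing abnormal time-series patterns based on fuzzy matching is further explained below through a specific example.
Fig. 5 shows a successful match. As shown in Fig. 5, the offline algorithm of the embodiments found a real stellar flare that the online algorithm had missed. The figure shows that the total length of this sequence is only 89 points and that all the data before it are invalid, so such an abnormal brightness change cannot be detected by accumulating valid values. This is beyond the reach of online, window-based statistical early-warning methods, and it is where the active pattern-matching anomaly algorithm of the embodiments has its advantage. After denoising, the noise strength of this stretch of data was computed to be 0.0605, which indicates a sequence with good signal continuity, a high signal-to-noise ratio, and a very low noise level. The 89 points were therefore divided into 5 sequence segments, so that the data within each segment are relatively stable.
For 89 points, taking any two points as the start and end of a subsequence search gives 89(89-1)/2 = 3916 combinations. The embodiments do not need to find the exact minimum of the best match among them. Instead, segmentation is used to search for the similarity between the most characteristic fragments and the query template, and after filtering by sequence-length and amplitude thresholds, the necessary conditions for a sequence match are used to narrow the search range. In the end only 8 of the 3916 possible subsequence combinations were computed, and a good match to the template was obtained, namely the subsequence formed by segments 2 to 4.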
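The size of the search-space reduction can be checked with a few lines of arithmetic. The count of 3916 point pairs comes from the text; the count of segment-aligned candidates is our own illustrative arithmetic, based on the assumption that 5 segments have 6 boundary points from which a start and an end are chosen. The text itself only reports that 8 combinations were finally evaluated after the length and amplitude filters.

```python
from math import comb


def candidate_counts(n_points, n_segments):
    """Return (all start/end point pairs, pairs restricted to segment boundaries)."""
    all_pairs = comb(n_points, 2)               # any two points as start and end
    boundary_pairs = comb(n_segments + 1, 2)    # start and end on segment boundaries
    return all_pairs, boundary_pairs


exhaustive, segment_aligned = candidate_counts(89, 5)
```

Under these assumptions `exhaustive` is 3916 and `segment_aligned` is 15, so restricting the search to segment boundaries removes more than 99% of the candidates before any further filtering.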
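In addition, the three subplots in Fig. 5 are, from top to bottom, subplot 1, subplot 2, and subplot 3. Subplot 1 is the denoised sequence, with the solid line marking the matched result. Subplot 2 shows how DTW aligns the template with the queried sequence, with corresponding matched data points joined by black dashed lines. Subplot 3 shows that the DTW algorithm handles the time-domain offset well: it matches the sharp brightness rise in the first half of the template and shifts the second half accordingly.
Further, the real data set used to test the embodiments consists of several days of data from one GWAC telescope covering several different sky regions. It contains 1,006,078 stars and about 269 million data points. After missing-value processing, only fragments with more than 20 points were kept. After this filtering, 253 million data points remained and entered data validity checking, denoising, and the subsequent analysis and decision steps. Because the curve of each star may be cut into several fragments, the 253 million data points form 1,084,269 light curves that meet the density threshold, with an average length of 233.04, among which are two flare events. In the full test, the serial version took 3336.68 s and the algorithm found both flares.
The method for recognizing abnormal time-series patterns based on fuzzy matching according to the embodiments of the present invention tolerates fuzzy matching of samples under variations in the time domain, amplitude, offset, and noise. It does not require every abnormal pattern to be precisely defined; the pattern fragments of interest can be found with only a small number of templates. By exploiting the noise in the data, a signal-to-noise ratio is defined to segment the time-series data and to drive the corresponding acceleration optimizations, so that sequence patterns are searched at an appropriate scale and on valid data.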
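Next, the apparatus for recognizing abnormal time-series patterns based on fuzzy matching according to the embodiments of the present invention is described with reference to the drawings.
Fig. 6 is a schematic structural diagram of the apparatus for recognizing abnormal time-series patterns based on fuzzy matching according to one embodiment of the present invention.
As shown in Fig. 6, the apparatus 10 includes a processing module 100, a detection module 200, and a matching module 300.
The processing module 100 is configured to apply missing-value processing, white-noise processing, and outlier processing to the original time series to obtain a time series that satisfies preset conditions. The detection module 200 is configured to detect turning points in the time series that satisfies the preset conditions, so as to divide that time series into multiple segmented subsequences. The matching module 300 is configured to locate the sequence segments from the start and end points of each segmented subsequence and to match them with the DTW algorithm to obtain the matching result. The apparatus 10 of the embodiments does not require every abnormal pattern to be precisely defined, can find the pattern fragments of interest with only a small number of templates, and searches sequence patterns at an appropriate scale and on valid data.
Further, in one embodiment of the present invention, the processing module 100 is further configured to cut the original time series at data interruptions according to a preset gap threshold to obtain multiple independent time-series fragments, to discard the fragments whose overall density is below a first preset threshold, and to linearly interpolate the remaining fragments to obtain the time-series fragments after missing-value processing.
Further, in one embodiment of the present invention, the processing module 100 is further configured to perform wavelet decomposition on the time-series fragments after missing-value processing to obtain the wavelet coefficients of each fragment, and to threshold the wavelet coefficients of each fragment so as to denoise the fragments whose wavelet coefficients are smaller than a second preset threshold.
Further, in one embodiment of the present invention, the processing module 100 is further configured to determine the noise strength from the difference between the wavelet-transformed data and the original data, and to invalidate systematic noise whose noise strength exceeds the random-error threshold.
Further, in one embodiment of the present invention, the apparatus 10 further includes a standardization module. The standardization module is configured to apply Z-Score standardization to each segmented subsequence before matching with the DTW algorithm, so that the matching template can match signals of the target amplitude.
It should be noted that the foregoing explanation of the method embodiments also applies to the apparatus of this embodiment and is not repeated here.
The apparatus for recognizing abnormal time-series patterns based on fuzzy matching according to the embodiments of the present invention tolerates fuzzy matching of samples under variations in the time domain, amplitude, offset, and noise. It does not require every abnormal pattern to be precisely defined; the pattern fragments of interest can be found with only a small number of templates. By exploiting the noise in the data, a signal-to-noise ratio is defined to segment the time-series data and to drive the corresponding acceleration optimizations, so that sequence patterns are searched at an appropriate scale and on valid data.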
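In addition, the terms "first" and "second" are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless otherwise expressly and specifically limited.
In the description of this specification, a description that refers to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. Furthermore, provided they do not contradict each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that the above embodiments are exemplary and shall not be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.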

Claims (10)

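  1. A method for recognizing abnormal time-series patterns based on fuzzy matching, characterized by comprising the following steps:
    applying missing-value processing, white-noise processing, and outlier processing to an original time series to obtain a time series that satisfies preset conditions;
    detecting turning points in the time series that satisfies the preset conditions, so as to divide the time series that satisfies the preset conditions into multiple segmented subsequences; and
    locating the sequence segments from the start and end points of each segmented subsequence, and matching with the DTW algorithm to obtain a matching result.
  2. The method according to claim 1, characterized in that the missing-value processing, white-noise processing, and outlier processing of the data comprise:
    cutting the original time series at data interruptions according to a preset gap threshold to obtain multiple independent time-series fragments;
    discarding the time-series fragments whose overall density is below a first preset threshold, and linearly interpolating the remaining time-series fragments to obtain the time-series fragments after missing-value processing.
  3. The method according to claim 2, characterized in that the missing-value processing, white-noise processing, and outlier processing of the data further comprise:
    performing wavelet decomposition on the time-series fragments after missing-value processing to obtain the wavelet coefficients of each fragment;
    thresholding the wavelet coefficients of each fragment, so as to denoise the fragments whose wavelet coefficients are smaller than a second preset threshold.
  4. The method according to claim 3, characterized in that the missing-value processing, white-noise processing, and outlier processing of the data further comprise:
    determining the noise strength from the difference between the wavelet-transformed data and the original data;
    invalidating systematic noise whose noise strength exceeds the random-error threshold.
  5. The method according to claim 1, characterized in that, before matching with the DTW algorithm, the method further comprises:
    applying Z-Score standardization to each segmented subsequence, so that the matching template can match signals of the target amplitude.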
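  6. An apparatus for recognizing abnormal time-series patterns based on fuzzy matching, characterized by comprising:
    a processing module configured to apply missing-value processing, white-noise processing, and outlier processing to an original time series to obtain a time series that satisfies preset conditions;
    a detection module configured to detect turning points in the time series that satisfies the preset conditions, so as to divide the time series that satisfies the preset conditions into multiple segmented subsequences; and
    a matching module configured to locate the sequence segments from the start and end points of each segmented subsequence, and to match with the DTW algorithm to obtain a matching result.
  7. The apparatus according to claim 6, characterized in that the processing module is further configured to cut the original time series at data interruptions according to a preset gap threshold to obtain multiple independent time-series fragments, to discard the time-series fragments whose overall density is below a first preset threshold, and to linearly interpolate the remaining time-series fragments to obtain the time-series fragments after missing-value processing.
  8. The apparatus according to claim 7, characterized in that the processing module is further configured to perform wavelet decomposition on the time-series fragments after missing-value processing to obtain the wavelet coefficients of each fragment, and to threshold the wavelet coefficients of each fragment so as to denoise the fragments whose wavelet coefficients are smaller than a second preset threshold.
  9. The apparatus according to claim 8, characterized in that the processing module is further configured to determine the noise strength from the difference between the wavelet-transformed data and the original data, and to invalidate systematic noise whose noise strength exceeds the random-error threshold.
  10. The method according to claim 1, characterized by further comprising:
    a standardization module configured to apply Z-Score standardization to each segmented subsequence before matching with the DTW algorithm, so that the matching template can match signals of the target amplitude.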
PCT/CN2019/099467 2019-07-24 2019-08-06 Method and apparatus for recognizing abnormal time-series patterns based on fuzzy matching WO2021012315A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910673710.3A CN110427996B (zh) 2019-07-24 2019-07-24 Method and apparatus for recognizing abnormal time-series patterns based on fuzzy matching
CN201910673710.3 2019-07-24

Publications (1)

Publication Number Publication Date
WO2021012315A1 true WO2021012315A1 (zh) 2021-01-28

Family

ID=68412269

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/099467 WO2021012315A1 (zh) 2019-07-24 2019-08-06 Method and apparatus for recognizing abnormal time-series patterns based on fuzzy matching

Country Status (2)

Country Link
CN (1) CN110427996B (zh)
WO (1) WO2021012315A1 (zh)

Also Published As

Publication number Publication date
CN110427996A (zh) 2019-11-08
CN110427996B (zh) 2022-03-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19938432

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19938432

Country of ref document: EP

Kind code of ref document: A1