JP6848546B2

JP6848546B2 - Change point detection device and change point detection method

Info

Publication number: JP6848546B2
Application number: JP2017044995A
Authority: JP
Inventors: 大介奥谷
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2017-03-09
Filing date: 2017-03-09
Publication date: 2021-03-24
Anticipated expiration: 2037-03-09
Also published as: JP2018147442A

Description

The present invention relates to anomaly detection in time-series data, and more particularly to a change point detection device and a change point detection method for detecting a change point in a data series containing a plurality of time-series data items.

Anomalies contained in time-series data have conventionally been detected by various methods (see, for example, Patent Documents 1 to 7). Depending on their purpose, anomaly detection methods for time-series data are broadly divided into three types: "change point detection", which detects the part where the pattern of the time-series data changes; "outlier detection", which detects data points that would not normally occur; and "anomalous section detection", which detects a section that differs from the normal pattern.

Patent Documents 1 and 2 disclose techniques relating to change point detection. The change point detection device of Patent Document 1 scores the degree of change in a data series using a forgetting-type learning algorithm based on an autoregressive model, and detects the time at which the score exceeds a threshold as a change point. The abnormal value detection device of Patent Document 2 uses vector similarity comparison to detect whether or not the latest time-series data is an abnormal value.

The techniques described in Patent Documents 3 and 4 are aimed at outlier detection. The state change alarm device of Patent Document 3 calculates a predicted value from a past data series by chaos inference, and detects time-series data that deviates greatly from the predicted value as an outlier. The abnormality detection device of Patent Document 4 sets a plurality of abnormality determination criteria based on statistical values, differential values, and the like, and detects time-series data satisfying all of the criteria as an outlier.

The techniques described in Patent Documents 5 to 7 are aimed at anomalous section detection, and each detects differences from a normal pattern by its own method. The abnormality detection device of Patent Document 5 handles temporal stretching and shrinking of a section by using a method based on dynamic time warping. The abnormality detection device of Patent Document 6 detects an anomalous section based on the difference between the amounts of change obtained from two data series. The abnormality detection system of Patent Document 7 detects an anomalous section by comparing statistics calculated from grouped data series.

Patent Document 1: Japanese Patent No. 4265296
Patent Document 2: Japanese Patent No. 4468131
Patent Document 3: Japanese Patent No. 3659723
Patent Document 4: Japanese Patent No. 5875430
Patent Document 5: Japanese Unexamined Patent Publication No. 2012-59198
Patent Document 6: Japanese Patent No. 5669553
Patent Document 7: Japanese Patent No. 5310094

However, with regard to the change point detection and outlier detection described above, the techniques of Patent Documents 1 and 3 attempt to capture anomalies from the transitions of a past data series, and the techniques of Patent Documents 2 and 4 attempt to capture anomalies from the statistics or displacement of a set of past time-series data. That is, with the techniques of Patent Documents 1 to 4, it is difficult to detect an anomaly until a certain amount of past time-series data has been accumulated. If each method were applied to a short data series in which time-series data has not been sufficiently accumulated, the techniques of Patent Documents 1 and 3 would fail to produce a valid prediction model at the model construction stage, and the techniques of Patent Documents 2 and 4 would be unable to capture the statistics or the displacement tendency of the past time-series data. Therefore, the techniques of Patent Documents 1 to 4 have the problem that the true change point is missed, particularly when the amount of accumulated time-series data is insufficient.

Similarly, the techniques of Patent Documents 5 to 7 treat a specific part of a past data series as the normal pattern and capture anomalies using feature quantities obtained from that pattern, so accumulating a certain amount of past time-series data is indispensable. For example, when anomalies are detected from time-series data produced by various sensors, many sensors sample from several to several thousand or more data items per second, so past time-series data usable as a normal pattern is plentiful. However, when the target is anomaly detection in, for example, an error log with only a few entries per month, the methods of Patent Documents 5 to 7 are difficult to apply. There is therefore a demand for a change point detection device and a change point detection method that detect anomalies accurately not only when time-series data has been sufficiently accumulated but also when the amount of accumulated time-series data is small.

The change point detection device according to the present invention comprises: an autoregressive model calculation unit that calculates an estimated value by an autoregressive model based on time-series data input from the outside; a weighted data creation unit that creates weighted data by weighting the time-series data so that the weight increases relatively with the newness of the time-series data; a Mahalanobis distance calculation unit that calculates a Mahalanobis distance using, as the time-series data, the weighted data created by the weighted data creation unit; and a change point score calculation unit that converts the magnitude of change in the time-series data into a score and outputs it by integrating and analyzing the estimated value calculated by the autoregressive model calculation unit and the Mahalanobis distance calculated by the Mahalanobis distance calculation unit.

The change point detection method according to the present invention comprises: an autoregressive model calculation step of calculating an estimated value by an autoregressive model based on time-series data input from the outside; a weighted data creation step of creating weighted data by weighting the time-series data so that the weight increases relatively with the newness of the time-series data; a Mahalanobis distance calculation step of calculating a Mahalanobis distance using, as the time-series data, the weighted data created in the weighted data creation step; and a change point score calculation step of converting the magnitude of change in the time-series data into a score and outputting it by integrating and analyzing the estimated value calculated in the autoregressive model calculation step and the Mahalanobis distance calculated in the Mahalanobis distance calculation step.

According to the present invention, both the estimated value from the autoregressive model and the Mahalanobis distance are used for anomaly detection, so accurate anomaly detection is possible even when the amount of accumulated time-series data is small. Highly accurate anomaly detection can therefore be performed regardless of how much time-series data is available.

A block diagram showing the configuration of the change point detection device according to the embodiment of the present invention. A flowchart showing the operation of the change point detection device of FIG. 1.

Embodiment.
FIG. 1 is a block diagram showing the configuration of a change point detection device according to an embodiment of the present invention. As shown in FIG. 1, the change point detection device 100 has an input unit 10, a preprocessing unit 20, a data storage unit 30, an autoregressive model calculation unit 40, an autoregressive value storage unit 45, a weighted data creation unit 50, a Mahalanobis distance calculation unit 60, a Mahalanobis distance storage unit 65, a change point score calculation unit 70, a change point score storage unit 75, and an output unit 80.

The input unit 10 accepts time-series data obtained from the outside as the target of anomaly detection. The time-series data obtained from the outside is, for example, sensing data acquired from various sensing devices such as an acceleration sensor, a vibration sensor, a radio wave sensor, or a microphone (none of which are shown), or error log data accumulated in a database. The change point detection device 100 can take as its target both sensing data with a sampling period of a few milliseconds and log data such as a monthly tally of error counts. That is, the change point detection device 100 can apply anomaly detection to a wide variety of time-series data regardless of the sampling interval.

The preprocessing unit 20 converts the time-series data accepted by the input unit 10 into parameters for change point detection. That is, the preprocessing unit 20 performs preprocessing such as subsampling or filtering according to the type of the time-series data.

Examples of the preprocessing include AD conversion, which converts analog sensing data into digital values; subsampling, which limits the data accumulated in the data storage unit 30 to the necessary minimum; a short-time Fourier transform, which extracts only the appropriate frequency components; and normalization, which fixes the value range of the data. That is, the preprocessing unit 20 is configured to execute at least one of AD conversion, subsampling, short-time Fourier transform, and normalization. The preprocessing unit 20 also stores each item of the preprocessed time-series data in the data storage unit 30 while preserving its time-series order.
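As a rough illustration of two of the preprocessing steps named above, the following minimal Python sketch subsamples a series and then normalizes it. The function names `subsample` and `normalize`, the fixed step, and the choice of min-max scaling are assumptions made for this example; the source does not specify how either step is carried out.

```python
def subsample(series, step):
    """Keep every `step`-th sample, preserving time order (assumed scheme)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return list(series[::step])


def normalize(series):
    """Min-max scale the values into [0, 1] (one possible normalization)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        # A constant series carries no range information; map it to zeros.
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


raw = [10.0, 12.0, 11.0, 15.0, 14.0, 20.0]
prepared = normalize(subsample(raw, 2))
print(prepared)
```

Because both functions return new lists in the original order, the time-series order that the data storage unit relies on is preserved.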

The data storage unit 30 accumulates the past data series. More specifically, the data storage unit 30 sequentially stores the time-series data preprocessed by the preprocessing unit 20. Here, a data series containing N time-series data items (N being an arbitrary natural number) is written "x_1, x_2, ..., x_{N-1}, x_N". Within the data series, an item with a larger index is newer. That is, in the data series "x_1, x_2, ..., x_{N-1}, x_N", x_1 is the oldest time-series data item and x_N is the newest.

The autoregressive model calculation unit 40 calculates an estimated value by an autoregressive model based on time-series data input from the outside, and sequentially stores the calculated estimated value in the autoregressive value storage unit 45. In other words, the autoregressive value storage unit 45 accumulates the estimated values calculated by the autoregressive model calculation unit 40.

In the autoregressive model, when the data series follows an AR (AutoRegressive) model of order p, x_N is expressed in terms of the past data series by Equation 1 below. Here, φ_j (j = 1, 2, ..., p) are coefficients, and ε_N denotes white noise with zero expected value and constant variance.

Similarly, when the data series follows an MA (Moving Average) model of order q, x_N is expressed by Equation 2 below. Here, θ_j (j = 1, 2, ..., q) are coefficients, and ε_N denotes white noise with zero expected value and constant variance.

Further, when the data series follows an ARMA (AutoRegressive Moving Average) model of order (p, q), x_N is expressed by Equation 3 below. Here,

The ARIMA (AutoRegressive Integrated Moving Average) model applies the ARMA model of Equation 3 to the d-th order difference of the series. When d = 1 it is expressed by Equation 4 below, and when d = 2 it is expressed by Equation 5 below, which is based on Equation 3.
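For reference, the standard textbook forms of these models can be written with the symbols defined above as follows. This is a sketch of the usual definitions with any constant term omitted, not a reproduction of the patent's Equations 1 to 5; the auxiliary symbols y_N and z_N for the differenced series are introduced here only for illustration.

```latex
\begin{align}
\text{AR}(p):\quad & x_N = \sum_{j=1}^{p} \phi_j \, x_{N-j} + \varepsilon_N \\
\text{MA}(q):\quad & x_N = \varepsilon_N + \sum_{j=1}^{q} \theta_j \, \varepsilon_{N-j} \\
\text{ARMA}(p, q):\quad & x_N = \sum_{j=1}^{p} \phi_j \, x_{N-j} + \varepsilon_N
  + \sum_{j=1}^{q} \theta_j \, \varepsilon_{N-j} \\
d = 1:\quad & y_N = x_N - x_{N-1} \\
d = 2:\quad & z_N = y_N - y_{N-1} = x_N - 2 x_{N-1} + x_{N-2}
\end{align}
```

In the ARIMA case the ARMA relation is applied to the differenced series (y_N for d = 1, z_N for d = 2) instead of to x_N itself.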

The parameters (p, d, q), the coefficients φ_j, θ_j (j = 1, 2, ..., p), and the white noise ε_N in Equations 1 to 5 are generally obtained analytically using the Yule-Walker method or the maximum likelihood method. In the present embodiment as well, the autoregressive model calculation unit 40 estimates the parameters according to these methods.
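To make the Yule-Walker step concrete, the following is a minimal pure-Python sketch that estimates AR(p) coefficients from sample autocovariances using the Levinson-Durbin recursion. The function names, the biased autocovariance estimator, and the simulated AR(1) test series are assumptions for this example; the source names the method but does not describe an implementation.

```python
import random


def autocovariance(x, max_lag):
    """Biased sample autocovariances r[0..max_lag] of the series x."""
    n = len(x)
    mean = sum(x) / n
    return [
        sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n
        for k in range(max_lag + 1)
    ]


def yule_walker(x, p):
    """Solve the Yule-Walker equations by Levinson-Durbin recursion.

    Returns (phi, noise_variance), where phi[j - 1] estimates phi_j.
    Assumes the series is not constant (r[0] > 0).
    """
    r = autocovariance(x, p)
    phi = []
    err = r[0]
    for k in range(1, p + 1):
        # Reflection coefficient for order k.
        acc = r[k] - sum(phi[j] * r[k - 1 - j] for j in range(k - 1))
        kappa = acc / err
        # Update the lower-order coefficients, then append the new one.
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        err *= 1.0 - kappa * kappa
    return phi, err


# Simulated AR(1) series with phi_1 = 0.5 (illustrative data only).
random.seed(0)
series = [0.0]
for _ in range(2000):
    series.append(0.5 * series[-1] + random.gauss(0.0, 1.0))

phi, noise_variance = yule_walker(series, 1)
print(phi, noise_variance)
```

With only a handful of samples the autocovariances become unreliable, which is the short-series weakness of purely autoregressive detection that the surrounding text describes.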

When the actual time-series data x_N differs greatly from the estimated value of the optimal autoregressive model obtained by the above processing, it can be judged that x_N deviates from the past data series, that is, that it is a change point. This is the principle of change point detection for a data series in which time-series data has been sufficiently accumulated, that is, a long data series. In other words, it is a principle of change point detection that is particularly useful when plenty of time-series data has been accumulated.
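The comparison between the model estimate and the actual value can be sketched as follows. The helper names `ar_estimate` and `deviation`, the hand-picked coefficients, and the use of the absolute difference are assumptions for illustration; the source only says that a large difference indicates a change point.

```python
def ar_estimate(history, phi):
    """Estimate x_N as sum_j phi_j * x_{N-j}; history[-1] is x_{N-1}."""
    if len(history) < len(phi):
        raise ValueError("need at least p past samples")
    return sum(c * history[-j] for j, c in enumerate(phi, start=1))


def deviation(actual, history, phi):
    """Absolute gap between the observed value and the AR estimate."""
    return abs(actual - ar_estimate(history, phi))


history = [1.0, 2.0, 3.0, 4.0]
phi = [2.0, -1.0]  # hypothetical coefficients: linear extrapolation

print(deviation(5.0, history, phi))  # value that continues the trend
print(deviation(9.0, history, phi))  # value that breaks the trend
```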

The weighted data creation unit 50 performs preliminary processing for detecting change points based on the Mahalanobis distance. That is, the weighted data creation unit 50 creates weighted data by weighting the time-series data so that the weight increases relatively with the newness of the time-series data. In the present embodiment, the weighted data creation unit 50 weights each time-series data item in the data series so that the weight of newer items is relatively increased. In other words, it weights each item so that the weight of older items is relatively decreased.

More specifically, the weighted data creation unit 50 applies the weighting process shown in Equation 6 below to the data series "x_1, x_2, ..., x_{N-1}, x_N" containing N time-series data items. In Equation 6, x'_t is the data series of the weighted data. Further, i is an integer of 2 or more and can be changed as appropriate according to, for example, the number of time-series data items in the data series. This weighting process particularly reduces the weights of the items more than i positions before the latest item x_N in the data series, namely x_1, x_2, ..., x_{N-(i+1)}, so that changes in the newer time-series data can be detected more sensitively.
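One way to realize a weighting with this property is a linearly weighted moving average over the most recent i samples, sketched below. This is an assumed scheme chosen for illustration and is not the patent's Equation 6, whose exact form is not available here; the function name `weighted_series` and the linear weights are hypothetical.

```python
def weighted_series(series, i):
    """Linearly weighted moving average over the last i samples.

    The newest sample in each window gets weight i, the one before it
    i - 1, and so on; samples older than the window get weight zero.
    (Assumed weighting scheme, not the patent's Equation 6.)
    """
    if i < 2:
        raise ValueError("i must be an integer of 2 or more")
    weighted = []
    for t in range(len(series)):
        window = series[max(0, t - i + 1): t + 1]
        # Align weights so that the last element of the window gets weight i.
        weights = range(i - len(window) + 1, i + 1)
        total = sum(w * v for w, v in zip(weights, window))
        weighted.append(total / sum(weights))
    return weighted


print(weighted_series([1.0, 2.0, 3.0], 2))
```

A larger i smooths the weighted series more strongly, while a smaller i keeps it closer to the raw data and more reactive to the newest sample.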

The Mahalanobis distance calculation unit 60 calculates a Mahalanobis distance based on time-series data input from the outside. In the present embodiment, the Mahalanobis distance calculation unit 60 calculates the Mahalanobis distance for the time-series data weighted by the weighted data creation unit 50, that is, for the weighted data.

More specifically, the Mahalanobis distance calculation unit 60 calculates the Mahalanobis distance d_t for the weighted data series x'_t, as shown in Equation 7 below. Here, μ denotes the mean of the data series and Σ denotes the covariance matrix. The Mahalanobis distance calculation unit 60 sequentially stores the calculated Mahalanobis distance d_t in the Mahalanobis distance storage unit 65. In other words, the Mahalanobis distance storage unit 65 accumulates the Mahalanobis distances d_t calculated by the Mahalanobis distance calculation unit 60.
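As an illustration, the sketch below computes the Mahalanobis distance in the one-dimensional case, where the covariance matrix Σ reduces to the variance and the usual definition becomes |x − μ| / σ. Whether Equation 7 takes exactly this form is an assumption, as are the function name `mahalanobis_1d`, the use of the population variance, and the handling of a zero-variance reference set.

```python
from math import inf, sqrt
from statistics import fmean, pvariance


def mahalanobis_1d(value, reference):
    """Mahalanobis distance of a scalar from a reference sample.

    mu and the variance are taken from `reference` (for example the
    accumulated weighted data); this choice is an assumption.
    """
    mu = fmean(reference)
    var = pvariance(reference, mu)
    if var == 0.0:
        # Degenerate case: every reference value is identical.
        return 0.0 if value == mu else inf
    return abs(value - mu) / sqrt(var)


print(mahalanobis_1d(4.0, [1.0, 2.0, 3.0]))
```

Because the distance needs only a mean and a variance, it can be evaluated from very few samples, which fits the short-series use described in the text.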

In this way, by using the Mahalanobis distance d_t calculated by the Mahalanobis distance calculation unit 60 after the preliminary processing by the weighted data creation unit 50, time-series data that deviates greatly from the set of past data can be detected as a change point. This is the principle of change point detection for a data series in which time-series data has not been sufficiently accumulated, that is, a short data series. In other words, it is a principle of change point detection that remains useful even when the amount of accumulated time-series data is small.

The change point score calculation unit 70 performs both the change point judgment based on the autoregressive model and the change point judgment based on the Mahalanobis distance, thereby realizing change point detection that does not depend on whether the data series is long or short. That is, the change point score calculation unit 70 converts the magnitude of change in the time-series data into a score by integrating and analyzing the estimated value calculated by the autoregressive model calculation unit 40 and the Mahalanobis distance calculated by the Mahalanobis distance calculation unit 60. In the present embodiment, the change point score calculation unit 70 quantifies the degree of abnormality of the current time-series data by converting the estimated value from the autoregressive model and the Mahalanobis distance based on the weighted moving average into values on the same scale.

As shown in FIG. 1, the change point score calculation unit 70 has a first calculation unit 71, a second calculation unit 72, and a comparison calculation unit 73.

The first calculation unit 71 acquires time-series data from the data storage unit 30 and acquires the estimated value of the autoregressive model from the autoregressive value storage unit 45. It then calculates the deviation value D, which is the difference between the time-series data accumulated in the data storage unit 30 and the estimated value of the autoregressive model accumulated in the autoregressive value storage unit 45. The first calculation unit 71 also stores the largest past deviation value D in the change point score storage unit 75 as the maximum deviation value D_MAX, and updates D_MAX as appropriate.

The first calculation unit 71 obtains the ratio of the calculated deviation value D to the maximum deviation value D_MAX, that is, "D / D_MAX", as the first change point score. The first calculation unit 71 then stores the obtained first change point score in the change point score storage unit 75.

Further, the first calculation unit 71 has a change point extraction function that determines whether or not the first change point score greatly exceeds 1. More specifically, letting α be the criterion for this determination, the first calculation unit 71 determines whether or not the first change point score "D / D_MAX" is larger than "1 + α". Here, the criterion α can be set on the assumption that the first change point score follows a normal distribution, for example as "(mean of the past first change point scores) + coefficient × (standard deviation of the past first change point scores)". The coefficient can be, for example, 3, and may be changed as appropriate.

When the change point extraction function finds that the first change point score greatly exceeds 1, the first calculation unit 71 judges that point in time, that is, the time at which the time-series data corresponding to that first change point score was input, to be a change point of the data series. In that case, the first calculation unit 71 adds, to the first change point score at the time judged to be a change point, identification information that lets an external device on the output side distinguish it from other first change point scores. In the present embodiment, the first calculation unit 71 sets a flag on that first change point score indicating that the data is abnormal.

Note that the first change point score at a time judged to be a change point indicates an abnormality. The deviation value D underlying that score is therefore considered to be an extreme value, and it is not appropriate to use it as the maximum deviation value D_MAX. Accordingly, the first calculation unit 71 does not update the maximum deviation value D_MAX with a deviation value D that underlies a first change point score judged to be a change point.
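The scoring, thresholding, and maximum-update rules described for the first calculation unit can be sketched together as follows. The class name `RatioScorer`, the bootstrap rule used before any positive maximum exists, and the assumption of non-negative input values are hypothetical details added for the example; the ratio score, the "1 + α" test with α = mean + coefficient × standard deviation, and the rule of not updating the maximum at a flagged point follow the text.

```python
from statistics import fmean, pstdev


class RatioScorer:
    """Turns non-negative values (e.g. deviation D) into ratio scores."""

    def __init__(self, coefficient=3.0):
        self.coefficient = coefficient
        self.max_value = None  # running maximum, e.g. D_MAX
        self.scores = []       # all past scores, flagged or not

    def update(self, value):
        """Return (score, flagged) for the newest value."""
        if not self.max_value:
            # Hypothetical bootstrap: no positive maximum yet, so take the
            # first value as the reference and never flag it.
            self.max_value = value
            score, flagged = (1.0 if value else 0.0), False
        else:
            score = value / self.max_value
            alpha = fmean(self.scores) + self.coefficient * pstdev(self.scores)
            flagged = score > 1.0 + alpha
            # A flagged value is treated as extreme and must not become
            # the new maximum.
            if not flagged and value > self.max_value:
                self.max_value = value
        self.scores.append(score)
        return score, flagged


scorer = RatioScorer()
flags = [scorer.update(v)[1] for v in [1.0, 1.2, 0.9, 1.1, 10.0]]
print(flags, scorer.max_value)
```

The same class can be fed Mahalanobis distances instead of deviation values, since both scores are defined as a ratio to a running maximum.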

The second calculation unit 72 acquires the Mahalanobis distance d_t from the Mahalanobis distance storage unit 65. The second calculation unit 72 also stores the largest past Mahalanobis distance d_t in the change point score storage unit 75 as the maximum distance value d_tMAX, and updates d_tMAX as appropriate.

The second calculation unit 72 uses the acquired Mahalanobis distance d_t to obtain the second change point score, which is on the same scale as the first change point score. More specifically, the second calculation unit 72 obtains the ratio of the Mahalanobis distance d_t of the current time-series data to the maximum distance value d_tMAX, that is, "d_t / d_tMAX", as the second change point score. The second calculation unit 72 then stores the obtained second change point score in the change point score storage unit 75.

Further, the second calculation unit 72 has a change point extraction function that determines whether or not the second change point score greatly exceeds 1. More specifically, letting β be the criterion for this determination, the second calculation unit 72 determines whether or not the second change point score "d_t / d_tMAX" is larger than "1 + β". Here, the criterion β can be set on the assumption that the second change point score follows a normal distribution, for example as "(mean of the past second change point scores) + coefficient × (standard deviation of the past second change point scores)". The coefficient can be, for example, 3, and may be changed as appropriate.

When the change point extraction function finds that the second change point score greatly exceeds 1, the second calculation unit 72 judges that point in time, that is, the time at which the time-series data corresponding to that second change point score was input, to be a change point of the data series. In that case, the second calculation unit 72 adds, to the second change point score at the time judged to be a change point, identification information that lets an external device on the output side distinguish it from other second change point scores. In the present embodiment, the second calculation unit 72 sets a flag on that second change point score indicating that the data is abnormal.

Note that the second change point score at a time judged to be a change point indicates an abnormality. The Mahalanobis distance d_t underlying that score is therefore considered to be an extreme value, and it is not appropriate to use it as the maximum distance value d_tMAX. Accordingly, the second calculation unit 72 does not update the maximum distance value d_tMAX with a Mahalanobis distance d_t that underlies a second change point score judged to be a change point.

The comparison calculation unit 73 compares, for each of the plurality of time-series data items, the first change point score obtained by the first calculation unit 71 with the second change point score obtained by the second calculation unit 72. For each item, it passes the larger of the two scores to the output unit 80 as the change point score.
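The comparison step amounts to an element-wise maximum over the two score sequences, as in the minimal sketch below. The function name `combine_scores` and the sample score values are illustrative assumptions.

```python
def combine_scores(first_scores, second_scores):
    """Per time step, keep the larger of the two change point scores."""
    if len(first_scores) != len(second_scores):
        raise ValueError("score sequences must have the same length")
    return [max(a, b) for a, b in zip(first_scores, second_scores)]


first = [0.5, 1.0, 3.2]   # hypothetical first change point scores
second = [0.8, 0.9, 1.1]  # hypothetical second change point scores
print(combine_scores(first, second))
```

Taking the maximum means a change point is reported as soon as either criterion reacts, which is what lets the device work for both long and short series.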

That is, the change point score calculation unit 70 determines change points using the autoregressive model estimates that the autoregressive model calculation unit 40 accumulates in the autoregressive value storage unit 45 and the Mahalanobis distances d_t that the Mahalanobis distance calculation unit 60 accumulates in the Mahalanobis distance storage unit 65. The change point score storage unit 75 accumulates the first change point score obtained by the first calculation unit 71 and the second change point score obtained by the second calculation unit 72.

The output unit 80 outputs the current change point score to the outside. That is, it outputs the change point scores passed sequentially from the comparison calculation unit 73 as the final detection result. The change point score output from the output unit 80 can be used for various purposes, such as anomaly detection in network traffic or in road traffic flow.

In the present embodiment, the first calculation unit 71 and the second calculation unit 72 each set a flag on the change point score at a time determined to be a change point. An external device on the output side connected to the output unit 80 can therefore, for example, display an alert to call the user's attention according to the flag set by the first calculation unit 71 or the second calculation unit 72.

Each of the above functions of the change point detection device 100 can be implemented in hardware such as a circuit device, or as software executed on an arithmetic unit such as a microcomputer, a DSP (Digital Signal Processor), or a CPU (Central Processing Unit). The data storage unit 30, the autoregressive value storage unit 45, the Mahalanobis distance storage unit 65, and the change point score storage unit 75 can be configured with RAM (Random Access Memory) and ROM (Read Only Memory), PROM (Programmable ROM) such as flash memory, an HDD (Hard Disk Drive), or the like.

FIG. 2 is a flowchart showing the operation of the change point detection device of FIG. 1. The change point detection method according to the present embodiment is described below with reference to FIG. 2.

First, the input unit 10 receives time series data obtained from the outside and passes it to the preprocessing unit 20 (step S101). The preprocessing unit 20 preprocesses the time series data acquired from the input unit 10 and stores the preprocessed time series data in the data storage unit 30. Specifically, as preprocessing for the time series data, the preprocessing unit 20 executes, as needed, at least one of AD conversion, subsampling, short-time Fourier transform, and normalization (step S102).
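
Two of the preprocessing options named in step S102, subsampling and normalization, can be sketched as follows. The text does not specify the subsampling rate or the kind of normalization, so the fixed `step` and the zero-mean, unit-variance scaling below are assumptions, and the function names are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def subsample(samples: Sequence[float], step: int) -> List[float]:
    """Keep every `step`-th sample, starting from the first."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(samples[::step])


def normalize(samples: Sequence[float]) -> List[float]:
    """Scale samples to zero mean and unit (population) standard deviation."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0.0:
        # A constant series carries no variation; map it to all zeros.
        return [0.0 for _ in samples]
    return [(x - mu) / sigma for x in samples]
```

Normalizing before scoring keeps series with different physical units comparable, which matters when several time series are processed by the same device.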

Here, the change point detection device 100 executes change point detection based on an autoregressive model as the process for detecting change points in a long data series. That is, the autoregressive model calculation unit 40 acquires the preprocessed time series data from the data storage unit 30 and calculates an estimate using the autoregressive model (step S103: autoregressive model calculation step).
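
As a rough illustration of step S103, the sketch below fits a first-order autoregressive model by ordinary least squares and predicts the next sample. The model order, the least-squares fit, and the names `fit_ar1` and `predict_next` are assumptions made for this example; this section does not state the order or the fitting procedure that the autoregressive model calculation unit 40 actually uses.

```python
from typing import Sequence, Tuple


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of x_t = c + a * x_{t-1}; returns (c, a)."""
    if len(series) < 3:
        raise ValueError("need at least 3 samples")
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    if var_prev == 0.0:
        # Constant history: predict the mean, with no dependence on x_{t-1}.
        return mean_curr, 0.0
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    a = cov / var_prev
    return mean_curr - a * mean_prev, a


def predict_next(series: Sequence[float], c: float, a: float) -> float:
    """One-step-ahead estimate from the most recent sample."""
    return c + a * series[-1]
```

The estimate returned by `predict_next` plays the role of the value stored in the autoregressive value storage unit 45 and compared against the next observed sample.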

Next, the first calculation unit 71 calculates the deviation value D, which is the difference between the time series data stored in the data storage unit 30 and the autoregressive model estimate stored in the autoregressive value storage unit 45. The first calculation unit 71 then obtains the first change point score, which is the ratio of the calculated deviation value D to the deviation maximum value D_MAX, and stores it in the change point score storage unit 75 (step S104: first calculation step).
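
The arithmetic of step S104 is small enough to show directly. In this sketch the deviation D is taken as an absolute difference, which is an assumption (the text says only "difference"), and the function name is hypothetical.

```python
def first_change_point_score(observed: float, estimated: float, d_max: float) -> float:
    """Ratio of the deviation D = |observed - estimated| to the maximum D_MAX."""
    if d_max <= 0.0:
        raise ValueError("D_MAX must be positive")
    deviation = abs(observed - estimated)  # deviation value D (assumed absolute)
    return deviation / d_max
```

For example, an observed value of 10.0 against an estimate of 7.0 with D_MAX = 2.0 gives a score of 1.5, that is, a deviation one and a half times the largest seen so far.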

The change point detection device 100 also performs, as the process for detecting change points in a short data series, change point detection based on the Mahalanobis distance in parallel with the change point detection based on the autoregressive model. That is, the weighted data creation unit 50 applies the weighting of Equation 6 to the time series data so that newer time series data receive relatively greater weight, and thereby creates weighted data (step S105: weighted data creation step). The Mahalanobis distance calculation unit 60 then calculates, based on Equation 7, the Mahalanobis distance for the weighted data created by the weighted data creation unit 50 (step S106: Mahalanobis distance calculation step).
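
Steps S105 and S106 can be illustrated for a single one-dimensional series as follows. Equations 6 and 7 are not reproduced in this section, so the sketch does not implement them: it substitutes linearly increasing weights (1 for the oldest sample up to n for the newest) and the one-dimensional form of the Mahalanobis distance, the distance from the weighted mean divided by the weighted standard deviation. Both choices and the function name are assumptions.

```python
import math
from typing import Sequence


def weighted_mahalanobis(window: Sequence[float], x: float) -> float:
    """1-D Mahalanobis distance of x from the recency-weighted mean of window.

    window is ordered oldest to newest; newer samples get larger weights.
    """
    if not window:
        raise ValueError("window must not be empty")
    weights = range(1, len(window) + 1)  # assumed linear weights, newest heaviest
    total = sum(weights)
    mu = sum(w * v for w, v in zip(weights, window)) / total
    var = sum(w * (v - mu) ** 2 for w, v in zip(weights, window)) / total
    if var == 0.0:
        # Degenerate window: any departure from the mean is infinitely far.
        return 0.0 if x == mu else math.inf
    return abs(x - mu) / math.sqrt(var)
```

Because only a short window and its weights are needed, this distance is available long before enough history has accumulated to fit an autoregressive model well.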

Next, the second calculation unit 72 obtains the second change point score, which is the ratio of the Mahalanobis distance d_t of the current time series data to the distance maximum value d_tMAX, and stores it in the change point score storage unit 75 (step S107: second calculation step).

Subsequently, the comparison calculation unit 73 determines the larger of the first change point score obtained by the first calculation unit 71 and the second change point score obtained by the second calculation unit 72 to be the change point score serving as the final detection result, and passes it to the output unit 80 (step S108: comparison calculation step). The output unit 80 outputs the change point score passed from the comparison calculation unit 73 as the detection result (step S109). Steps S104, S107, and S108 correspond to the "change point score calculation step" of the present invention. The change point detection device 100 executes the series of processes from step S101 to step S109 on each time series data of each data series.
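
The comparison of step S108 reduces to taking the larger of the two scores for each time series. A minimal sketch, with a hypothetical function name and with the scores for several series held in parallel lists as an assumed layout:

```python
from typing import List, Sequence


def select_scores(first_scores: Sequence[float],
                  second_scores: Sequence[float]) -> List[float]:
    """Per series, pick the larger of the first and second change point scores."""
    if len(first_scores) != len(second_scores):
        raise ValueError("score lists must have the same length")
    return [max(f, s) for f, s in zip(first_scores, second_scores)]
```

Taking the maximum is meaningful only because both scores are ratios to their own historical maxima and therefore share a scale.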

As described above, by using both an autoregressive model and the Mahalanobis distance computed with a weighted moving average, the change point detection method of the present embodiment can accurately detect the portion where the pattern of the time series data changes, whether many or few past data series are available. The method performs a process mainly for detecting change points in a long data series (steps S103 and S104) and a process mainly for detecting change points in a short data series (steps S105 to S107), but "long" and "short" are relative measures. Because the method runs both processes in parallel without fixing a threshold that defines whether a data series is long or short, that is, whether the accumulated amount of time series data is large or small, change points are less likely to be missed.

As described above, because the change point detection device 100 uses both the autoregressive model estimate and the Mahalanobis distance for anomaly detection, it can detect anomalies accurately even when the accumulated amount of time series data is small, and therefore with high accuracy regardless of the amount of time series data. That is, the change point detection device 100 obtains an index using the autoregressive model in order to detect change points from time series data accumulated over a relatively long period, and obtains the Mahalanobis distance as an index in order to detect change points from time series data accumulated within a relatively short period. Anomalies can therefore be detected accurately not only when sufficient normal data series have been obtained in the past, but also when few past data series are available, a case in which detection was difficult with conventional methods. Furthermore, because the change point detection device 100 performs the autoregressive-model-based process and the Mahalanobis-distance-based process in parallel without fixing a threshold that defines whether a data series is long or short, it can suppress missed change points.

The first change point score and the second change point score are values on the same scale, and the change point detection device 100 outputs the larger of the two. Whether an anomaly has occurred can thus be judged from the output value that reflects the larger change, which makes finding anomalies more efficient.

The embodiment described above is a preferred specific example of the change point detection device and the change point detection method, and the technical scope of the present invention is not limited to these aspects. For example, the above embodiment illustrates the case where the change point detection device 100 outputs the larger of the first change point score and the second change point score, but the invention is not limited to this. The change point score calculation unit 70 may instead be configured without the comparison calculation unit 73, so that both the first change point score and the second change point score are passed to the output unit 80. In that case, however, more change points are output, and the effort needed to identify the essential anomaly among them may increase. In this respect, the change point detection device 100 of the present embodiment indexes the degree of change in the time series data as the first change point score and the second change point score, which are on the same scale, and outputs them. This makes it easy to judge the priority of each output change point score, so anomalies can be found efficiently. Alternatively, the comparison calculation unit 73 may compare the first change point score with the second change point score, set a flag on the larger one, and output both. The external device on the output side can then, for example, display an alert to call the user's attention according to the flag, which makes finding anomalies still more efficient.
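
The last variation, outputting both scores with a flag on the larger one, might look like the sketch below. The record layout `FlaggedScores`, the function name, and the rule that a tie flags the first score are all assumptions; the text does not say how the flag is encoded or how ties are resolved.

```python
from typing import NamedTuple


class FlaggedScores(NamedTuple):
    """Both change point scores plus a flag marking the larger one."""

    first: float
    second: float
    first_flagged: bool
    second_flagged: bool


def flag_larger(first: float, second: float) -> FlaggedScores:
    """Output both scores, flagging whichever is larger (ties flag the first)."""
    first_is_larger = first >= second  # assumed tie rule
    return FlaggedScores(first, second, first_is_larger, not first_is_larger)
```

An external device receiving such a record could raise its alert from the flagged score while still logging the other one for later inspection.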

10 input unit, 20 preprocessing unit, 30 data storage unit, 40 autoregressive model calculation unit, 45 autoregressive value storage unit, 50 weighted data creation unit, 60 Mahalanobis distance calculation unit, 65 Mahalanobis distance storage unit, 70 change point score calculation unit, 71 first calculation unit, 72 second calculation unit, 73 comparison calculation unit, 75 change point score storage unit, 80 output unit, 100 change point detection device, D deviation value, D_MAX deviation maximum value, d_t Mahalanobis distance, d_tMAX distance maximum value.

Claims

A change point detection device comprising:
an autoregressive model calculation unit that calculates an estimate by an autoregressive model based on time series data input from the outside;
a weighted data creation unit that creates weighted data by weighting the time series data so that the weight increases relatively with the newness of the time series data;
a Mahalanobis distance calculation unit that calculates a Mahalanobis distance using, as the time series data, the weighted data created by the weighted data creation unit; and
a change point score calculation unit that, by integrating and analyzing the estimate calculated by the autoregressive model calculation unit and the Mahalanobis distance calculated by the Mahalanobis distance calculation unit, converts the magnitude of change in the time series data into a score and outputs the score.

The change point detection device according to claim 1, wherein the change point score calculation unit includes:
a first calculation unit that calculates a deviation value, which is the difference between the current time series data and the estimate, and obtains a first change point score based on the deviation value;
a second calculation unit that converts the Mahalanobis distance to obtain a second change point score on the same scale as the first change point score; and
a comparison calculation unit that outputs the larger of the first change point score and the second change point score as the score.

The change point detection device according to claim 1, wherein the change point score calculation unit includes:
a first calculation unit that calculates a deviation value, which is the difference between the current time series data and the estimate, and obtains, as a first change point score, the ratio of that deviation value to the maximum of the past deviation values; and
a second calculation unit that obtains, as a second change point score, the ratio of the current Mahalanobis distance to the maximum of the past Mahalanobis distances.

The change point detection device according to claim 3, wherein the change point score calculation unit includes a comparison calculation unit that outputs the larger of the first change point score and the second change point score as the score.

The change point detection device according to any one of claims 1 to 4, further comprising a preprocessing unit that executes, as preprocessing for the time series data, at least one of AD conversion, subsampling, short-time Fourier transform, and normalization.

A change point detection method comprising:
an autoregressive model calculation step of calculating an estimate by an autoregressive model based on time series data input from the outside;
a weighted data creation step of creating weighted data by weighting the time series data so that the weight increases relatively with the newness of the time series data;
a Mahalanobis distance calculation step of calculating a Mahalanobis distance using, as the time series data, the weighted data created in the weighted data creation step; and
a change point score calculation step of, by integrating and analyzing the estimate calculated in the autoregressive model calculation step and the Mahalanobis distance calculated in the Mahalanobis distance calculation step, converting the magnitude of change in the time series data into a score and outputting the score.

The change point detection method according to claim 6, wherein the change point score calculation step includes:
a first calculation step of calculating a deviation value, which is the difference between the current time series data and the estimate, and obtaining a first change point score based on the deviation value;
a second calculation step of converting the Mahalanobis distance to obtain a second change point score on the same scale as the first change point score; and
a comparison calculation step of outputting the larger of the first change point score and the second change point score as the score.

The change point detection method according to claim 6, wherein the change point score calculation step includes:
a first calculation step of calculating a deviation value, which is the difference between the current time series data and the estimate, and obtaining, as a first change point score, the ratio of that deviation value to the maximum of the past deviation values; and
a second calculation step of obtaining, as a second change point score, the ratio of the current Mahalanobis distance to the maximum of the past Mahalanobis distances.

The change point detection method according to claim 8, wherein the change point score calculation step further includes a comparison calculation step of outputting the larger of the first change point score and the second change point score as the score.

The change point detection method according to any one of claims 7 to 9, further comprising a preprocessing step of executing, prior to the autoregressive model calculation step and the Mahalanobis distance calculation step, at least one of AD conversion, subsampling, short-time Fourier transform, and normalization as preprocessing for the time series data.