WO2023209799A1 - Change point score calculation device, change point score calculation method, and program - Google Patents

Change point score calculation device, change point score calculation method, and program

Info

Publication number
WO2023209799A1
Authority
WO
WIPO (PCT)
Prior art keywords
cluster
cluster transition
distance
change point
centroid
Prior art date
Application number
PCT/JP2022/018871
Other languages
French (fr)
Japanese (ja)
Inventor
彰子 高橋
恵 竹下
Original Assignee
日本電信電話株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電信電話株式会社
Priority to PCT/JP2022/018871
Publication of WO2023209799A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring

Definitions

  • the present disclosure relates to calculating a change point score in consideration of distance.
  • system state refers to the operating state of the system expressed by quantitative variables such as "number of accesses" and "number of users."
  • the technique described in Non-Patent Document 1 is known as a technique for detecting change points in time-series data to which no correct labels regarding the positions of the change points are attached.
  • the method of Non-Patent Document 1 extends change point detection by clustering; because it is a clustering-based method, the target time series is not subject to constraints such as the stationarity constraint or the independent and identically distributed (IID) constraint.
  • the method of Non-Patent Document 1 clusters the time windows at each point in time series data, and then tracks the clusters assigned to each point in the time axis direction to extract transition patterns. It can be said that this method introduced the concept of
  • the method of Non-Patent Document 1 sets a past period and a current period that are sufficiently longer than the time window, and calculates a change point score by comparing the distribution of cluster transition patterns in both periods. This method is capable of detecting change points, including changes in time-series patterns, because the change point score is calculated for interval data with a fixed time width rather than shot data.
  • the change point detection device proposed in Non-Patent Document 1 has a time window generation unit that takes time-series data representing the system state at each point in time of a system composed of one or more devices and converts the (number of devices constituting the system x number of items) dimensional data at each point in time into (number of devices x number of items x time window length) dimensional data to generate converted data, and functions (a clustering unit, a cluster transition sequence creation unit, a cluster transition tensor calculation unit, a change point score calculation unit, and a detection unit) that detect a change point when the change point score of the system state, calculated based on the converted data at each point in time, exceeds a preset threshold.
  • as the method of calculating the change point score, that is, the distance between the cluster transition tensors of the past period and the current period, the change point score calculation unit uses the following mean squared error.
  • here, D^1 and D^2 are the cluster transition tensors of the past period and the current period, respectively, and d^1_{c1,...,cL} and d^2_{c1,...,cL} are the elements of D^1 and D^2 that store the stay probability of the cluster transition pattern {c1, c2, ..., cL}.
  • with this distance calculation method, for example, when a cluster transition pattern that was not observed in the past period is newly observed in the current period, the distance calculation gives the same result whether the newly observed pattern is far from or similar to the cluster transition patterns observed in the past. Conversely, when a cluster transition pattern that was observed in the past period is not observed in the current period, the distance calculation again gives the same result whether that past pattern is far from or similar to the cluster transition patterns currently observed.
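  • the limitation described above can be reproduced with a minimal sketch. The function name `mse_score`, the dictionary representation of the tensors, the toy cluster labels, and the normalization by the number of possible patterns are all assumptions made for illustration; they are not taken from Non-Patent Document 1.

```python
from itertools import product

def mse_score(d1, d2, clusters, length):
    """Mean squared error between two cluster transition tensors.

    d1, d2 map a pattern (tuple of cluster labels) to its stay probability;
    patterns that are absent are treated as probability 0.
    Assumption: the sum is divided by the number of possible patterns.
    """
    patterns = list(product(clusters, repeat=length))
    total = sum((d1.get(p, 0.0) - d2.get(p, 0.0)) ** 2 for p in patterns)
    return total / len(patterns)

clusters = ["a", "b", "c"]
past = {("a", "b"): 0.5, ("b", "a"): 0.5}

# Two hypothetical current periods. Each moves 0.2 of the probability mass
# to a pattern that never appeared in the past period.
current_near = {("a", "b"): 0.5, ("b", "a"): 0.3, ("b", "b"): 0.2}
current_far = {("a", "b"): 0.5, ("b", "a"): 0.3, ("c", "c"): 0.2}

score_near = mse_score(past, current_near, clusters, 2)
score_far = mse_score(past, current_far, clusters, 2)
print(score_near, score_far)  # the two scores are identical
```

  • the plain squared error only sees that 0.2 of the mass moved, not where it moved, so both hypothetical current periods receive the same score even if cluster "c" lies far from clusters "a" and "b".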
  • the change point score that is, the past period.
  • the present invention has been made in view of the above points. Its aim is to introduce a mechanism that promotes an increase in the change point score when a cluster transition pattern far from those observed in the past is newly observed, or when a cluster transition pattern far from those currently observed was observed in the past, and thereby to achieve more precise change point detection.
  • the invention according to claim 1 comprises the following units.
  • a centroid coordinate input unit that inputs the centroid coordinates, that is, the centers, of all clusters that the clustering unit generated and assigned to the (number of devices x number of items x time window length) dimensional data at each point in time constituting the time-series data of the past period and the current period.
  • a cluster transition pattern input unit that inputs all cluster transition patterns that appeared in the past period and the current period, extracted by the cluster transition tensor calculation unit.
  • a cluster transition tensor input unit that inputs the cluster transition tensors for the past period and the current period calculated by the cluster transition tensor calculation unit.
  • an inter-centroid distance matrix calculation unit that calculates an inter-centroid distance matrix for all cluster pairs based on the centroid coordinates of all the clusters input by the centroid coordinate input unit.
  • an inter-cluster transition pattern distance matrix calculation unit that calculates a distance matrix for all cluster transition pattern pairs based on all the cluster transition patterns input by the cluster transition pattern input unit and the inter-centroid distance matrix for all cluster pairs calculated by the inter-centroid distance matrix calculation unit.
  • a change point score calculation unit that calculates the distance between the cluster transition tensors of the past period and the current period, taking into account the distance between the cluster transition patterns, based on the cluster transition tensors for the past period and the current period input by the cluster transition tensor input unit and the distance matrix for all cluster transition pattern pairs calculated by the inter-cluster transition pattern distance matrix calculation unit.
  • when a cluster transition pattern far different from those observed in the past is newly observed, or when a cluster transition pattern far different from those currently observed was observed in the past, a mechanism is introduced that encourages an increase in the change point score, which has the effect of realizing more precise change point detection.
  • FIG. 2 is a flowchart illustrating an example of the change point detection process.
  • FIG. 3 is a diagram illustrating an example of the functional configuration of the change point score calculation device that takes distance into consideration according to the present embodiment.
  • FIG. 4 is a flowchart illustrating an example of the change point score calculation process that takes distance into consideration according to the present embodiment.
  • a further figure shows an example of the hardware configuration of the change point detection device and the change point score calculation device.
  • first, a change point detection device 10 will be described that uses time-series data representing the system state at each point in time of a system (S) consisting of one or more devices and, when some change occurs in the system state, can detect the point of occurrence as a change point.
  • the "system state" refers to the operating state of the system expressed by quantitative variables such as "number of accesses" and "number of users."
  • FIG. 1 is a diagram showing an example of the functional configuration of a change point detection device.
  • the change point detection device 10 includes an input section 11, a time window generation section 12, a period setting section 13, a clustering section 14, a cluster transition sequence generation section 15, a cluster transition tensor calculation section 16, a change point score calculation section 17, a detection section 18, and an output section 19.
  • the "devices" in “Number of devices” and “Device status” shown below indicate devices constituting the system whose change point is to be detected by the change point detection device 10.
  • the input unit 11 inputs time-series data that represents the system state at each point in time of a system (S) composed of one or more devices and that consists of (number of devices constituting the system (S) x number of items of device status) dimensional data.
  • the time window generation unit 12 divides the time-series data input by the input unit 11 into time windows of fixed length, converts the data at each point in time from (number of devices x number of items) dimensional data into (number of devices x number of items x time window length) dimensional data to generate converted data, and performs intermediate output.
  • the period setting unit 13 extracts the time-series data of a preset past period and current period from the (number of devices x number of items x time window length) dimensional time-series data generated by the time window generation unit 12, and performs intermediate output.
  • the clustering unit 14 uses a clustering method to classify the states of the (number of devices x number of items x time window length) dimensional data at each point in time constituting the time-series data of the past period and the current period extracted by the period setting unit 13, and performs intermediate output.
  • the cluster transition sequence creation unit 15 tracks, in the time axis direction, the clusters that the clustering unit 14 assigned to the (number of devices x number of items x time window length) dimensional data at each point in time in the past period and the current period, creates a sequence of cluster transitions between different clusters for each of the past period and the current period, assigns to each cluster constituting the cluster transition sequence the period of stay in that cluster, and performs intermediate output.
  • the cluster transition tensor calculation unit 16 extracts cluster transitions of a preset fixed length from the cluster transition sequence created by the cluster transition sequence creation unit 15 and calculates the appearance probability of each cluster transition pattern in the past period and the current period. It then calculates, for each of the past period and the current period, a cluster transition tensor whose rank (i.e., number of dimensions) is the cluster transition length, whose index in each dimension ranges over the unique values of all the clusters that appeared in the past period and the current period, and whose values are the appearance probabilities of the cluster transition patterns, and performs intermediate output.
  • the change point score calculation unit 17 calculates, as the degree of change from the past period to the current period, the distance between the cluster transition tensor of the past period and that of the current period, based on the cluster transition tensors calculated by the cluster transition tensor calculation unit 16, and performs intermediate output.
  • the detection unit 18 detects a time point as a change point when the change point score calculated by the change point score calculation unit 17 exceeds a preset threshold. That is, the detection unit 18 detects a change point when the change point score of the system state, calculated based on the data (converted data) at each point in time, exceeds a preset threshold.
  • the output unit 19 outputs the change point detected by the detection unit 18.
  • FIG. 2 is a flowchart illustrating an example of the change point detection process.
  • in the following, let M be the number of devices constituting the system (S), K the number of data items representing the system state at each point in time, and N the number of observation points in the time-series data; the time-series data is assumed to consist of N pieces of M×K dimensional data.
  • the elements of the M×K dimensional data at each point in time are the K observed values representing the state of each of the M devices at that point in time.
  • the M×K dimensional data at a certain point in time is expressed as [x1, ..., xK, xK+1, ..., x2K, ..., x(M-1)K+1, ..., xMK], where, for m = 1, ..., M, the elements x(m-1)K+1, ..., xmK are the K observed values representing the state of the m-th device.
  • Step S11 First, the input unit 11 inputs time-series data composed of N pieces of M×K (number of devices × number of items) dimensional data. That is, if the M×K dimensional data at time n is Xn, the input unit 11 inputs the time-series data {X1, ..., XN}.
  • Step S12 Next, the time window generation unit 12 divides the time-series data input in step S11 into time windows of fixed length W, thereby converting the data at each time point from M×K (number of devices × number of items) dimensional data into M×K×W (number of devices × number of items × time window length) dimensional data to generate converted data, and performs intermediate output.
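  • the conversion in step S12 can be sketched as follows. This is only an illustration: the function name `make_windows` is hypothetical, and it assumes a sliding window that covers each time point and the W-1 points before it, a detail the text does not fix.

```python
def make_windows(series, w):
    """Turn N vectors of length M*K into N-w+1 vectors of length M*K*w.

    Assumption: the window at a time point covers that point and the w-1
    points before it; the first w-1 points therefore have no window.
    """
    if not 1 <= w <= len(series):
        raise ValueError("window length must be between 1 and len(series)")
    windows = []
    for end in range(w - 1, len(series)):
        flat = []
        for x in series[end - w + 1 : end + 1]:
            flat.extend(x)  # concatenate the w consecutive M*K vectors
        windows.append(flat)
    return windows

# Toy input: N = 5 time points, M = 1 device, K = 2 items.
series = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
windows = make_windows(series, w=3)
print(len(windows), windows[0])
```

  • each output vector is the flattened form of one M×K×W block, which is the shape the clustering step consumes.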
  • Step S14 Next, the clustering unit 14 uses a clustering method to classify the states of the (e1-s1+e2-s2+2) pieces of M×K×W (number of devices × number of items × time window length) dimensional data that constitute the time-series data of the past period of length (e1-s1+1) and the current period of length (e2-s2+1) extracted in step S13, and obtains the cluster sequences corresponding to the time-series data. Specifically, when the cluster to which the M×K×W dimensional data Yn at time n belongs is denoted Cn, the clustering unit 14 obtains the cluster sequence {Cs1, ..., Ce1} from the time-series data {Ys1, ..., Ye1} of the past period.
  • clustering is a process that groups the (e1-s1+e2-s2+2) pieces of M×K×W dimensional data so that data close to one another in distance fall into the same cluster.
  • a cluster sequence is obtained by arranging the clusters assigned to each piece of M×K×W dimensional data in chronological order.
  • as the clustering method, either a hierarchical method (for example, the shortest distance method, longest distance method, group average method, or Ward method) or a non-hierarchical method (for example, the K-Means method) may be used.
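  • as one possible instance of step S14, the sketch below runs a plain K-Means (Lloyd's algorithm) over a handful of toy vectors and returns both the cluster sequence and the centroids. The function name `kmeans`, the deterministic seeding with the first k points, and the fixed iteration count are assumptions chosen to keep the example short; any of the clustering methods listed above could be used instead.

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-Means (Lloyd's algorithm) with deterministic initialisation."""
    # Assumption: the first k points seed the clusters.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy windowed data in time order (two obvious groups).
points = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9], [0.1, 0.2]]
labels, centroids = kmeans(points, k=2)
print(labels)  # cluster sequence in chronological order
```

  • the list `labels` plays the role of the cluster sequence, and `centroids` is what a later distance-aware score would need as the cluster centers.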
  • Step S15 Next, the cluster transition sequence creation unit 15 tracks, in the time axis direction, the clusters assigned in step S14 to the M×K×W (number of devices × number of items × time window length) dimensional data at each point in the past period [s1, e1] and the current period [s2, e2], creates a cluster transition sequence between different clusters for each of the past period and the current period, and assigns to each cluster constituting the cluster transition sequence the period of stay in that cluster.
  • the cluster sequence {Cs1, ..., Ce1} obtained from the time series data {Ys1, ..., Ye1} of the past period [s1, e1]
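  • a minimal sketch of step S15, assuming the stay period is counted as the number of consecutive time points spent in a cluster (the unit of the stay period is not specified in the text). The function name `transition_sequence` and the toy labels are hypothetical.

```python
from itertools import groupby

def transition_sequence(cluster_seq):
    """Collapse a per-time-point cluster sequence into (cluster, stay) pairs.

    Consecutive repeats of the same cluster are merged, so neighbouring
    entries of the result always belong to different clusters.
    """
    return [(c, len(list(run))) for c, run in groupby(cluster_seq)]

cluster_seq = ["a", "a", "b", "b", "b", "a", "c"]
transitions = transition_sequence(cluster_seq)
print(transitions)
```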
  • Step S16 Next, the cluster transition tensor calculation unit 16 extracts cluster transitions of a preset fixed length L from the cluster transition sequences created in step S15 and calculates the appearance probability of each cluster transition pattern. It then calculates, for each of the past period and the current period, a cluster transition tensor whose rank (number of dimensions) is the cluster transition length L, whose index in each dimension ranges over the unique values of all clusters that appeared in the past period and the current period, and whose values are the appearance probabilities of the cluster transition patterns.
  • taking the cluster transition sequence c(τ_1) → c(τ_2) → ... → c(τ_I) as an example, (I-(L-1)) cluster transitions of length L (L ≤ I) can be extracted from this sequence, namely c(τ_{i-(L-1)}) → c(τ_{i-(L-2)}) → ... → c(τ_i) (i = L, ..., I).
  • the cluster transition tensor calculating unit 16 calculates the appearance probability of these (I-(L-1)) cluster transitions for each pattern, and calculates an L-dimensional cluster transition tensor based on this.
  • the appearance probability of a cluster transition pattern is a value obtained by dividing the frequency of appearance of the cluster transition pattern by the total frequency of appearance of all cluster transition patterns.
  • the frequency of appearance of a cluster transition pattern may be a value weighted by the stay period of the cluster transition pattern.
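  • the extraction of length-L transitions and their appearance probabilities can be sketched as below. The function name `pattern_probabilities` is hypothetical, and the stay-period weighting shown (summing the stay periods of the clusters in the pattern) is just one possible choice, since the text only says that such a weighting may be used.

```python
from collections import Counter

def pattern_probabilities(transitions, length, weight_by_stay=False):
    """Appearance probability of each cluster transition pattern.

    transitions: list of (cluster, stay_period) pairs.
    A sequence of I clusters yields I - (length - 1) patterns.
    Assumption: with weight_by_stay=True, one occurrence counts as the
    sum of the stay periods of its clusters.
    """
    counts = Counter()
    for i in range(len(transitions) - length + 1):
        window = transitions[i : i + length]
        pattern = tuple(c for c, _ in window)
        counts[pattern] += sum(s for _, s in window) if weight_by_stay else 1
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()}

transitions = [("a", 2), ("b", 3), ("a", 1), ("b", 4), ("c", 1)]
probs = pattern_probabilities(transitions, length=2)
weighted = pattern_probabilities(transitions, length=2, weight_by_stay=True)
print(probs)
```

  • storing the result as a dictionary keyed by pattern is equivalent to a sparse form of the rank-L tensor; absent keys correspond to probability 0.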
  • as an example, consider the case where L = 2 and the unique values of all clusters that appeared in the past period and the current period are α, β, and γ.
  • in this case, the cluster transition tensor is two-dimensional, and the index of each dimension takes the three values α, β, and γ.
  • the cluster transition tensor can therefore be represented by a 3×3 array. If the appearance probability of the cluster transition pattern α → β is 0.1, the value 0.1 is stored in the array element whose first-axis index (the first element of the cluster transition pattern) is α and whose second-axis index (the second element of the cluster transition pattern) is β.
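  • the array layout for L = 2 with three clusters can be illustrated as follows. The cluster labels `alpha`, `beta`, and `gamma` are hypothetical stand-ins for the three unique clusters, and the nested-list representation is an assumption; any dense array type would serve.

```python
clusters = ["alpha", "beta", "gamma"]  # unique clusters over both periods
index = {c: i for i, c in enumerate(clusters)}

# Rank L = 2, so the tensor is a 3 x 3 array initialised to probability 0.
tensor = [[0.0] * len(clusters) for _ in clusters]

# Store the appearance probability 0.1 of the pattern alpha -> beta:
# first axis = first cluster of the pattern, second axis = second cluster.
tensor[index["alpha"]][index["beta"]] = 0.1

for row in tensor:
    print(row)
```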
  • Step S17 Next, the change point score calculation unit 17 calculates, as the degree of change from the past period to the current period, the distance between the cluster transition tensor of the past period and that of the current period, based on the cluster transition tensors calculated in step S16. If the elements of the cluster transition tensor D^1 of the past period are d^1_{i1,...,iL} and the elements of the cluster transition tensor D^2 of the current period are d^2_{i1,...,iL}, the distance between them can be expressed by, for example, the following mean squared error.
  • Step S18 Next, the detection unit 18 detects a time point as a change point when the change point score calculated in step S17 exceeds a preset threshold. That is, the detection unit 18 detects a change point when the change point score of the system state, calculated based on the data (converted data) at each point in time, exceeds a preset threshold.
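  • the thresholding in step S18 amounts to a simple comparison, sketched below. The function name `detect_change_points`, the (time, score) pair layout, and the threshold value are illustrative assumptions.

```python
def detect_change_points(scores, threshold):
    """Return the time points whose change point score exceeds the threshold."""
    return [t for t, score in scores if score > threshold]

# Hypothetical (time point, change point score) pairs.
scores = [(100, 0.01), (101, 0.02), (102, 0.35), (103, 0.04)]
change_points = detect_change_points(scores, threshold=0.1)
print(change_points)
```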
  • Step S19 Finally, the output unit 19 outputs the change point detected in step S18.
  • FIG. 3 is a hardware configuration diagram of the change point detection device.
  • the change point detection device 10 includes a processor 101, a memory 102, an auxiliary storage device 103, a connection device 104, a communication device 105, and a drive device 106. Note that each piece of hardware that constitutes the change point detection device 10 is interconnected via a bus 107.
  • the processor 101 plays the role of a control unit that controls the entire change point detection device 10, and includes various calculation devices such as a CPU (Central Processing Unit).
  • the processor 101 reads various programs onto the memory 102 and executes them.
  • the processor 101 may include GPGPU (General-purpose computing on graphics processing units).
  • the memory 102 includes main storage devices such as ROM (Read Only Memory) and RAM (Random Access Memory).
  • the processor 101 and the memory 102 form a so-called computer, and when the processor 101 executes various programs read onto the memory 102, the computer realizes various functions.
  • the auxiliary storage device 103 stores various programs and various information used when the various programs are executed by the processor 101.
  • connection device 104 is a connection device that connects an external device (for example, the display device 110, the operation device 111) and the change point detection device 10.
  • the communication device 105 is a communication device for transmitting and receiving various information to and from other devices.
  • the drive device 106 is a device for setting the recording medium 130.
  • the recording medium 130 herein includes a medium that records information optically, electrically, or magnetically, such as a CD-ROM (Compact Disc Read-Only Memory), a flexible disk, and a magneto-optical disk. Further, the recording medium 130 may include a semiconductor memory that electrically records information, such as a ROM (Read Only Memory) or a flash memory.
  • the various programs to be installed in the auxiliary storage device 103 are installed by, for example, setting the distributed recording medium 130 in the drive device 106 and having the drive device 106 read out the various programs recorded on the recording medium 130.
  • various programs installed in the auxiliary storage device 103 may be installed by being downloaded from a network via the communication device 105.
  • the change point detection device 10 uses time-series data representing the system state at each point in time of the system (S) composed of one or more devices, and detects when some change occurs in the system state. It is possible to detect the point of occurrence as a point of change.
  • because the change point detection device 10 is based on a method of classifying the system state at each point in time using a clustering method, it can handle time-series data that does not satisfy the stationarity constraint or the IID constraint, such as data showing periodic fluctuations. Furthermore, by considering the state transitions of the system (S) over time (that is, the transitions of the cluster to which the system state belongs at each point in time and the stay period in each cluster), the change point detection device 10 can detect changes that include changes in time-series patterns, such as changes in periodic fluctuations.
  • the change point score calculation device 20, which will be described later, has the same hardware configuration as the change point detection device 10, so its description is omitted.
  • <Change point score calculation device> Next, a change point score calculation device 20 will be described. When the change point score calculation unit 17 of the change point detection device 10 calculates the distance between the cluster transition tensors of the past period and the current period, the change point score calculation device 20 weights the squared error of the stay probability of each cluster transition pattern between the two periods by the distance of that pattern from the other patterns. This introduces a mechanism that encourages an increase in the change point score when a cluster transition pattern far from those observed in the past is newly observed, or when a cluster transition pattern far from those currently observed was observed in the past, and thereby realizes more precise change point detection.
  • FIG. 3 is a diagram showing an example of the functional configuration of the change point score calculation device according to the present embodiment.
  • the change point score calculation device 20 includes a centroid coordinate input section 21, a cluster transition pattern input section 22, a cluster transition tensor input section 23, an inter-centroid distance matrix calculation section 24, an inter-cluster transition pattern distance matrix calculation section 25, a change point score calculation section 26, and an output section 27.
  • the centroid coordinate input unit 21 inputs the centroid (cluster center) coordinates of all clusters generated by the clustering unit 14, that is, all clusters assigned to the (number of devices x number of items x time window length) dimensional data at each point in time that constitutes the time-series data of the past period and the current period.
  • the cluster transition pattern input unit 22 inputs all cluster transition patterns (all cluster transition patterns that appeared in the past period and the current period) extracted by the cluster transition tensor calculation unit 16.
  • the cluster transition tensor input unit 23 inputs the cluster transition tensors for each of the past period and the current period calculated by the cluster transition tensor calculation unit 16.
  • the inter-centroid distance matrix calculation unit 24 calculates the inter-centroid distance matrices for all cluster pairs based on the centroid coordinates of all clusters input by the centroid coordinate input unit 21, and performs intermediate output.
  • the inter-cluster transition pattern distance matrix calculation unit 25 calculates a distance matrix for all cluster transition pattern pairs based on all cluster transition patterns input by the cluster transition pattern input unit 22 and the inter-centroid distance matrix for all cluster pairs calculated by the inter-centroid distance matrix calculation unit 24, and performs intermediate output.
  • the change point score calculation unit 26 calculates the cluster transition tensors for each of the past period and the current period input by the cluster transition tensor input unit 23 and all the cluster transition pattern pairs calculated by the inter-cluster transition pattern distance matrix calculation unit 25. Based on the distance matrix, the distance between cluster transition tensors for the past period and the current period is calculated, taking into account the distance between cluster transition patterns, and intermediate output is performed. Note that the change point score calculation section 26 is a functional section that shows one aspect of the function of the change point score calculation section 17.
  • the output unit 27 outputs the distance between cluster transition tensors between the past period and the current period calculated by the change point score calculation unit 26 as a change point score, and passes it to the detection unit 18.
  • FIG. 4 is a flowchart illustrating an example of the change point score calculation process according to the present embodiment.
  • Step S23 The cluster transition tensor input unit 23 inputs the cluster transition tensors for each of the past period and the current period calculated by the cluster transition tensor calculation unit 16. That is, if the cluster transition tensors calculated for the past period and the current period are D^1 and D^2, respectively, the cluster transition tensor input unit 23 inputs these two tensors D^1 and D^2.
  • here, D^1 and D^2 are both tensors whose rank (number of dimensions) is the length L of the cluster transition pattern and whose index in each dimension ranges over the unique values of all clusters that appeared in the past period and the current period. The element corresponding to a combination of indexes (clusters), one per dimension, stores the stay probability of the cluster transition pattern obtained by arranging those clusters in order of dimension.
  • the Euclidean distance expressed by the following formula may be used, or other distances (Manhattan distance, Chebyshev distance, Mahalanobis distance, etc.) may be used.
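  • an inter-centroid distance matrix over all cluster pairs can be sketched as follows, with the Euclidean distance as the default and the Manhattan distance as one of the alternatives mentioned. The function names and the toy centroid coordinates are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two centroid coordinate vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def manhattan(u, v):
    """Manhattan distance, one of the alternative distances."""
    return sum(abs(a - b) for a, b in zip(u, v))

def centroid_distance_matrix(centroids, dist=euclidean):
    """Square matrix whose (i, j) entry is the distance between centroids i and j."""
    return [[dist(ci, cj) for cj in centroids] for ci in centroids]

# Hypothetical centroid coordinates of three clusters.
centroids = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
m_c = centroid_distance_matrix(centroids)
m_c_l1 = centroid_distance_matrix(centroids, dist=manhattan)
for row in m_c:
    print(row)
```

  • the matrix is symmetric with a zero diagonal, so the minimum entry always corresponds to a cluster paired with itself.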
  • Step S25 The inter-cluster transition pattern distance matrix calculation unit 25 calculates a distance matrix for all cluster transition pattern pairs based on all cluster transition patterns input by the cluster transition pattern input unit 22 and the inter-centroid distance matrix for all cluster pairs calculated by the inter-centroid distance matrix calculation unit 24.
  • the inter-centroid distance is d_c(X_i, X_j)
  • the stored distance matrix M_p of M rows and M columns is determined.
  • alignment technology which is a technology in the bioinformatics field, is used.
  • Alignment technology is a technology that can be used to determine the similarity between two or more sequences, and it seeks the optimal correspondence between sequences by inserting gap symbols to make the sequence lengths the same. It is a technique for solving optimization problems. In step S25, in particular, a pairwise alignment technique is used to find the optimal correspondence between two sequences.
  • What is optimized (maximized) in pairwise alignment technology is the score calculated from the correspondence between the two sequences. A score is set in advance for each of three cases (the aligned characters are the same, they are different, or one of them is a gap), and the alignment score is calculated by summing these scores over all aligned pairs.
  • step S25 by using this pairwise alignment technology, it is possible to find the optimal correspondence between the two cluster transition patterns and calculate the maximum score.
  • the distance between two cluster transition patterns can be calculated from this similarity. For example, the distance can be calculated by normalizing the similarity in the interval [0, 1] and then subtracting it from 1.
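  • the pairwise alignment and the conversion from similarity to distance can be sketched with the standard Needleman-Wunsch recurrence for global alignment. This is an illustrative sketch, not the patent's exact procedure: the function names are hypothetical, the mismatch score used here is a simple placeholder, and the rescaling to [0, 1] assumes two patterns of equal length L whose mismatch scores are never below -g, so that the score lies in [-g·L, s·L].

```python
def alignment_score(a, b, sub_score, gap):
    """Needleman-Wunsch global alignment score of sequences a and b.

    sub_score(x, y) scores an aligned pair of clusters; `gap` (a negative
    number) scores a cluster aligned with a gap symbol.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,   # a[i-1] aligned with a gap
                score[i][j - 1] + gap,   # b[j-1] aligned with a gap
            )
    return score[-1][-1]

def pattern_distance(a, b, sub_score, s, g):
    """1 - similarity, with the alignment score rescaled to [0, 1].

    Assumption: len(a) == len(b) == L and every mismatch scores at least
    -g, so the raw score lies in [-g * L, s * L].
    """
    length = len(a)
    raw = alignment_score(a, b, sub_score, gap=-g)
    similarity = (raw + g * length) / ((s + g) * length)
    return 1.0 - similarity

s, g = 1.0, 1.0

def simple(x, y):
    """Placeholder pair score: +s for identical clusters, -g otherwise."""
    return s if x == y else -g

d_same = pattern_distance(("a", "b"), ("a", "b"), simple, s, g)
d_shift = pattern_distance(("a", "b"), ("b", "a"), simple, s, g)
print(d_same, d_shift)
```

  • identical patterns get distance 0, while the reversed pattern gets a larger distance because its best alignment needs two gaps around a single match.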
  • in step S25, the characters constituting the sequences to be aligned (cluster transition patterns) correspond to clusters, so the inter-centroid distance calculated in step S24 is used as the score set for pairs of different characters (cluster pairs). For example, if the score set for a pair of identical characters (cluster pair) is +s and the score set when one of the pair is a gap is -g, the inter-centroid distance matrix M_c is normalized so that its minimum value becomes +s and its maximum value becomes -g.
  • the minimum value of the inter-centroid distance matrix M_c corresponds to the distance between identical clusters, so it is associated with the score +s set for a pair of identical characters (cluster pair); the maximum value of M_c corresponds to the distance between the farthest pair of clusters, so it is associated with the score -g set when one of the pair is a gap.
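  • the normalization of the inter-centroid distance matrix into alignment scores can be sketched as a linear rescaling. The text only fixes the two end points (minimum maps to +s, maximum maps to -g); the linear interpolation in between, the function name `substitution_scores`, and the toy matrix are assumptions.

```python
def substitution_scores(m_c, s, g):
    """Rescale an inter-centroid distance matrix so that min -> +s, max -> -g.

    Assumption: the mapping between the two end points is linear.
    """
    flat = [d for row in m_c for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == d_min:
        # Degenerate case (e.g. a single cluster): every pair is a match.
        return [[s for _ in row] for row in m_c]
    scale = (s + g) / (d_max - d_min)
    return [[s - (d - d_min) * scale for d in row] for row in m_c]

# Hypothetical inter-centroid distances for three clusters.
m_c = [[0.0, 5.0, 10.0],
       [5.0, 0.0, 5.0],
       [10.0, 5.0, 0.0]]
scores = substitution_scores(m_c, s=1.0, g=1.0)
for row in scores:
    print(row)
```

  • with this mapping, nearby clusters are penalized less than distant ones when they are aligned against each other, which is what lets the alignment score reflect how far apart two cluster transition patterns are.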
  • Step S26 The change point score calculation unit 26 calculates the distance between the cluster transition tensors of the past period and the current period, taking into account the distances between cluster transition patterns, based on the cluster transition tensors for each of the past period and the current period input by the cluster transition tensor input unit 23 and the distance matrix for all cluster transition pattern pairs calculated by the inter-cluster transition pattern distance matrix calculation unit 25.
  • the distance d(D^1, D^2) between the cluster transition tensors of the past period and the current period, taking into account the distance between cluster transition patterns, is calculated, for example, by the following equation.
  • the m-th weight element is the average distance of the m-th cluster transition pattern from the other patterns, and is calculated by the following formula (N is the number of columns of the distance matrix M, that is, the number of cluster transition patterns).
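  • since the equations themselves are not reproduced here, the following is only a hedged sketch of a distance-weighted squared error in the spirit of step S26. It assumes that each pattern's weight is the mean of its row in the pattern distance matrix (taken over all N columns) and that the weighted squared errors are averaged over the N patterns; the function names, the toy matrix, and the toy probabilities are hypothetical.

```python
def pattern_weights(m_p):
    """Average distance of each pattern from the others (row mean of m_p).

    Assumption: the mean is taken over all N columns; the diagonal is 0.
    """
    n = len(m_p)
    return [sum(row) / n for row in m_p]

def weighted_score(p1, p2, m_p):
    """Squared error of stay probabilities, weighted per pattern.

    p1, p2: stay probabilities of the N patterns in the past and current
    period, listed in the same order as the rows of m_p.
    """
    weights = pattern_weights(m_p)
    n = len(weights)
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, p1, p2)) / n

# Three hypothetical patterns: 0 and 1 are close, 2 is far from both.
m_p = [[0.0, 0.1, 0.9],
       [0.1, 0.0, 0.9],
       [0.9, 0.9, 0.0]]
past = [1.0, 0.0, 0.0]
near = [0.8, 0.2, 0.0]  # new probability mass on the nearby pattern 1
far = [0.8, 0.0, 0.2]   # new probability mass on the distant pattern 2

score_near = weighted_score(past, near, m_p)
score_far = weighted_score(past, far, m_p)
print(score_near, score_far)
```

  • unlike the unweighted squared error, the same amount of probability mass produces a larger score when it lands on a pattern that is far from the others, which is the behaviour the weighting is meant to encourage.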
  • Step S27: The output unit 27 outputs the distance d (D 1 , D 2 ) between the cluster transition tensors of the past period and the current period, calculated by the change point score calculation unit 26, as the change point score and passes it to the detection unit 18.
  • As described above, in the change point score calculation device 20 according to the present embodiment, when the change point score calculation unit 17 calculates the distance between the cluster transition tensors of the past period and the current period, the squared error of the stay probability for each cluster transition pattern is weighted by that pattern's distance from the other patterns. This introduces a mechanism that encourages the change point score to increase when a newly observed cluster transition pattern is far from those observed in the past, or when a cluster transition pattern observed in the past is far from those currently observed, which allows more precise change point detection.
  • the present invention is not limited to the above-described embodiments, and may have the following configuration or processing (operation).
  • The change point detection device 10 and the change point score calculation device 20 can be realized by a computer and a program, and the program can be recorded on a (non-transitory) recording medium or provided through a network such as the Internet.
  • 10 Change point detection device; 11 Input unit; 12 Time window generation unit; 13 Period setting unit; 14 Clustering unit; 15 Cluster transition sequence creation unit; 16 Cluster transition tensor calculation unit; 17 Change point score calculation unit; 18 Detection unit; 19 Output unit; 20 Change point score calculation device; 21 Centroid coordinate input unit; 22 Cluster transition pattern input unit; 23 Cluster transition tensor input unit; 24 Inter-centroid distance matrix calculation unit; 25 Inter-cluster-transition-pattern distance matrix calculation unit; 26 Change point score calculation unit; 27 Output unit
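
The distance-aware weighting described for steps S26 and S27 above can be sketched as follows. The formulas themselves are not reproduced in the text, so this is only a rough illustration under stated assumptions: the weight of each pattern is taken as its mean distance to the other N - 1 patterns, the weighted squared errors are averaged over N before the square root, and the function names `pattern_weights` and `weighted_tensor_distance` are hypothetical.

```python
import math

def pattern_weights(dist_matrix):
    """Average distance of each cluster transition pattern from the others.

    dist_matrix is an N x N symmetric matrix with a zero diagonal. Dividing
    by N - 1 is an assumption of this sketch, not the source formula.
    """
    N = len(dist_matrix)
    if N < 2:
        return [0.0] * N
    return [sum(row) / (N - 1) for row in dist_matrix]

def weighted_tensor_distance(p1, p2, dist_matrix):
    """Root of the distance-weighted squared probability differences.

    p1[m] and p2[m] are the probabilities of pattern m in the past and
    current periods. The normalization by N is likewise an assumption.
    """
    w = pattern_weights(dist_matrix)
    N = len(w)
    total = sum(w[m] * (p2[m] - p1[m]) ** 2 for m in range(N))
    return math.sqrt(total / N)

# Hypothetical distances: patterns 0 and 1 are close, pattern 2 is far from both.
D = [[0.0, 0.1, 0.9],
     [0.1, 0.0, 0.8],
     [0.9, 0.8, 0.0]]
past = [1.0, 0.0, 0.0]
near = weighted_tensor_distance(past, [0.5, 0.5, 0.0], D)  # new pattern is close
far = weighted_tensor_distance(past, [0.5, 0.0, 0.5], D)   # new pattern is far
```

With these made-up numbers, `far` comes out larger than `near`, which is the qualitative behavior the embodiment aims for.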

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

A change point score calculation device according to the present disclosure: enters centroid coordinates, which are the centers of all clusters assigned to data at each time point that has dimension equal to the number of devices × the number of items × a time window length and constitutes time series data for a past period and the current period; enters all cluster transition patterns appearing in the past period and the current period; enters the cluster transition tensor for each of the past period and the current period; calculates the inter-centroid distance matrix for all cluster pairs on the basis of the centroid coordinates of all clusters; calculates the distance matrix for all cluster transition pattern pairs, on the basis of all cluster transition patterns and the inter-centroid distance matrix for all cluster pairs; and calculates the distance between the cluster transition tensors for the past period and the current period on the basis of the cluster transition tensor for each of the past period and the current period and the distance matrix for all cluster transition pattern pairs, taking into account the distance between the cluster transition patterns.

Description

Change point score calculation device, change point score calculation method, and program
The present disclosure relates to calculating a change point score that takes distance into account.
Techniques are known for detecting change points in the system state of a system composed of one or more devices, using time-series data that represents the system state at each point in time. Here, the "system state" refers to the operating state of the system as expressed by quantitative variables such as the "number of accesses" and the "number of users."
The technique described in Non-Patent Document 1 is known as a technique for detecting change points in time-series data to which no ground-truth labels regarding change point positions are attached.
The method of Non-Patent Document 1 extends techniques that detect change points by clustering. Because it is clustering-based, the target time series is not subject to constraints such as stationarity or the independent and identically distributed (i.i.d.) assumption. The method also introduces the concept of a time axis: after clustering the time windows at each time point of the time-series data, it tracks the clusters assigned to each time point along the time axis and extracts transition patterns. Furthermore, the method sets a past period and a current period that are each sufficiently longer than the time window, and calculates a change point score by comparing the distributions of cluster transition patterns in the two periods. Because the score is calculated for interval data with a certain time width rather than for snapshot data at each time point, the method can detect change points that involve changes in time-series patterns.
Specifically, the change point detection device proposed in Non-Patent Document 1 has: an input unit that inputs time-series data representing the system state at each time point of a system composed of one or more devices, the time-series data consisting of data whose dimension is (number of devices constituting the system × number of items representing the state of each device); a time window generation unit that generates converted data by converting the time-series data at each time point from (number of devices × number of items)-dimensional data into (number of devices × number of items × time window length)-dimensional data; and a function (clustering unit, cluster transition sequence creation unit, cluster transition tensor calculation unit, change point score calculation unit, and detection unit) that detects a change point when the change point score of the system state, calculated from the converted data at each time point, exceeds a preset threshold.
The change point score calculation unit calculates the change point score, that is, the distance between the cluster transition tensors of the past period and the current period, using the following root mean squared error:

$$\mathrm{dist}(D_1, D_2) = \left( \sum_{l=1}^{L} \sum_{m=1}^{M} \frac{\left( d^{2}_{c_1,\dots,c_L} - d^{1}_{c_1,\dots,c_L} \right)^{2}}{M^{L}} \right)^{1/2}$$

Here, $D_1$ and $D_2$ are the cluster transition tensors of the past period and the current period, respectively, and $d^{1}_{c_1,\dots,c_L}$ and $d^{2}_{c_1,\dots,c_L}$ are the elements of $D_1$ and $D_2$ that store the stay probability of the cluster transition pattern $\{c_1, c_2, \dots, c_L\}$. The superscripts 1 and 2 on $d$ are period indices, not exponents.
However, with the above inter-tensor distance calculation, when, for example, a cluster transition pattern is newly observed in the current period that was not observed in the past period, the distance calculation gives the same result whether the newly observed pattern is far from or similar to the other cluster transition patterns observed in the past. Conversely, when a cluster transition pattern was observed in the past period but is not observed in the current period, the result is likewise the same whether that past pattern is far from or similar to the other patterns currently observed. When a newly observed cluster transition pattern is far from those observed in the past, or when a pattern observed in the past is far from those currently observed, it is desirable for the change point score (the distance between the cluster transition tensors of the past and current periods) to increase. The above calculation method, however, cannot take this degree of separation (distance) between cluster transition patterns into account.
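
A small worked example with hypothetical numbers may make this limitation concrete. Assume $M = 3$ clusters ($\alpha$, $\beta$, $\gamma$) and $L = 2$; the past period contains only the pattern $\alpha \to \beta$ with probability 1, and the current period contains $\alpha \to \beta$ with probability 0.5 plus one new pattern $p$ with probability 0.5. Reading the summation as running over all $M^L = 9$ tensor elements (an interpretation, since the source notation is terse):

$$\mathrm{dist}(D_1, D_2) = \sqrt{\frac{(0.5 - 1)^2 + (0.5 - 0)^2}{3^2}} = \sqrt{\frac{0.5}{9}} \approx 0.236$$

The value does not depend on which pattern $p$ is, so a new pattern that closely resembles $\alpha \to \beta$ and one that is very different from it receive the same score.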
The present invention has been made in view of the above, and aims to achieve more precise change point detection by introducing a mechanism that encourages the change point score to increase when a newly observed cluster transition pattern is far from those observed in the past, or when a cluster transition pattern observed in the past is far from those currently observed.
To achieve the above object, the invention according to claim 1 is a change point score calculation device comprising: a centroid coordinate input unit that inputs centroid coordinates generated by a clustering unit, the centroid coordinates being the centers of all clusters assigned to the (number of devices × number of items × time window length)-dimensional data at each time point constituting the time-series data of a past period and a current period; a cluster transition pattern input unit that inputs all cluster transition patterns that appeared in the past period and the current period, as extracted by a cluster transition tensor calculation unit; a cluster transition tensor input unit that inputs the cluster transition tensors of the past period and the current period, as calculated by the cluster transition tensor calculation unit; an inter-centroid distance matrix calculation unit that calculates an inter-centroid distance matrix for all cluster pairs based on the centroid coordinates of all clusters input by the centroid coordinate input unit; an inter-cluster-transition-pattern distance matrix calculation unit that calculates a distance matrix for all pairs of cluster transition patterns based on all the cluster transition patterns input by the cluster transition pattern input unit and the inter-centroid distance matrix for all cluster pairs calculated by the inter-centroid distance matrix calculation unit; and a change point score calculation unit that calculates the distance between the cluster transition tensors of the past period and the current period, taking into account the distance between cluster transition patterns, based on the cluster transition tensors of the past period and the current period input by the cluster transition tensor input unit and the distance matrix for all pairs of cluster transition patterns calculated by the inter-cluster-transition-pattern distance matrix calculation unit.
As described above, the present invention introduces a mechanism that encourages the change point score to increase when a newly observed cluster transition pattern is far from those observed in the past, or when a cluster transition pattern observed in the past is far from those currently observed, and thereby achieves more precise change point detection.
FIG. 1 is a diagram showing an example of the functional configuration of the change point detection device on which this embodiment is premised.
FIG. 2 is a flowchart showing the change point detection process.
FIG. 3 is a diagram showing an example of the functional configuration of the distance-aware change point score calculation device according to this embodiment.
FIG. 4 is a flowchart showing an example of the distance-aware change point score calculation process according to this embodiment.
FIG. 5 is a diagram showing an example of the hardware configuration of the change point detection device and the change point score calculation device.
● Change point detection device on which this embodiment is premised
Before describing the change point score calculation device 20 of this embodiment, the change point detection device 10 on which this embodiment is premised will be described with reference to FIGS. 1, 2, and 5. Note that the change point detection device 10 and the change point score calculation device 20 are the same device; the two names simply highlight different aspects of its features.
This section describes a change point detection device 10 that uses time-series data representing the system state at each time point of a system (S) composed of one or more devices and that, when some change occurs in the system state, can detect the time of occurrence as a change point. Here, the "system state" refers to the operating state of the system as expressed by quantitative variables such as the "number of accesses" and the "number of users."
[Functional configuration]
First, the functional configuration of the change point detection device 10 will be described with reference to FIG. 1. FIG. 1 is a diagram showing an example of the functional configuration of the change point detection device.
As shown in FIG. 1, the change point detection device 10 has an input unit 11, a time window generation unit 12, a period setting unit 13, a clustering unit 14, a cluster transition sequence creation unit 15, a cluster transition tensor calculation unit 16, a change point score calculation unit 17, a detection unit 18, and an output unit 19. In what follows, the "devices" in "number of devices" and "device state" are the devices constituting the system whose change points the change point detection device 10 is to detect.
The input unit 11 inputs time-series data representing the system state at each time point of a system (S) composed of one or more devices, the time-series data consisting of (number of devices constituting the system (S) × number of items representing the device state)-dimensional data.
The time window generation unit 12 divides the time-series data input by the input unit 11 into fixed-length time windows, converts the data at each time point from (number of devices × number of items)-dimensional data into (number of devices × number of items × time window length)-dimensional data to generate converted data, and outputs the result as intermediate output.
The period setting unit 13 extracts the time-series data of a preset past period and a preset current period from the (number of devices × number of items × time window length)-dimensional time-series data generated by the time window generation unit 12, and outputs the result as intermediate output.
The clustering unit 14 classifies, by a clustering method, the state of the (number of devices × number of items × time window length)-dimensional data at each time point constituting the past-period and current-period time-series data extracted by the period setting unit 13, and outputs the result as intermediate output.
The cluster transition sequence creation unit 15 tracks, along the time axis, the clusters that the clustering unit 14 assigned to the (number of devices × number of items × time window length)-dimensional data at each time point of the past period and the current period. For each of the two periods it creates a sequence of cluster transitions between different clusters, attaches to each cluster in the sequence the stay duration in that cluster, and outputs the result as intermediate output.
The cluster transition tensor calculation unit 16 extracts cluster transitions of a preset fixed length from the cluster transition sequence created by the cluster transition sequence creation unit 15 and calculates the appearance probability of each cluster transition pattern in the past period and the current period. For each period it then calculates a cluster transition tensor whose rank (that is, number of dimensions) is the cluster transition length, whose index along each dimension ranges over the unique values of all clusters that appeared in the past period and the current period, and whose values are the appearance probabilities of the cluster transition patterns, and outputs the result as intermediate output.
Based on the cluster transition tensors of the past period and the current period calculated by the cluster transition tensor calculation unit 16, the change point score calculation unit 17 calculates the distance between the past-period tensor and the current-period tensor as the degree of change from the past period to the current period, and outputs the result as intermediate output.
The detection unit 18 detects a change point when the change point score calculated by the change point score calculation unit 17 exceeds a preset threshold. That is, the detection unit 18 detects a change point when the change point score of the system state, calculated from the data (converted data) at each time point, exceeds a preset threshold.
The output unit 19 outputs the change point detected by the detection unit 18.
[Change point detection process]
Next, the change point detection process (procedure) will be described with reference to FIG. 2. FIG. 2 is a flowchart showing an example of the change point detection process.
In the following, let M be the number of devices constituting the system (S), K the number of data items representing the system state at each time point, and N the number of observation time points, so that the time-series data consists of N pieces of M×K-dimensional data.
Each element of the M×K-dimensional data at a given time point is one of the K observed values that represent the state of each of the M devices at that time. Specifically, if the M×K-dimensional data at a given time point is written [x_1, ..., x_K, x_{K+1}, ..., x_{2K}, ..., x_{(M-1)K+1}, ..., x_{MK}], then for m = 1, ..., M, the values x_{(m-1)K+1}, ..., x_{mK} are the K observed values of the m-th device at that time point.
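
As a small illustration of this layout, the following sketch slices the K observations of the m-th device out of a flattened M×K vector. The function name `device_observations` and the sample values are illustrative assumptions, not part of the original description.

```python
def device_observations(x, m, K):
    """Return the K observed values of the m-th device (1-indexed)
    from a flattened M*K vector x = [x_1, ..., x_MK]."""
    if m < 1 or m * K > len(x):
        raise IndexError("device index out of range")
    # x_{(m-1)K+1}, ..., x_{mK} in 1-indexed notation correspond to
    # the 0-indexed slice [(m-1)*K, m*K).
    return x[(m - 1) * K : m * K]

# Hypothetical example: M = 2 devices, K = 3 items per device.
x = [10, 11, 12, 20, 21, 22]
second = device_observations(x, 2, 3)
```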
Step S11: First, the input unit 11 inputs time-series data consisting of N pieces of M×K (number of devices × number of items)-dimensional data. That is, writing the M×K-dimensional data at time n as X_n, the input unit 11 inputs the time-series data {X_1, ..., X_N}.
Step S12: Next, the time window generation unit 12 divides the time-series data input in step S11 into time windows of fixed length W, thereby converting the data at each time point from M×K (number of devices × number of items)-dimensional data into M×K×W (number of devices × number of items × time window length)-dimensional data, and outputs the converted data as intermediate output. Specifically, the M×K×W-dimensional vector Y_n = (X_{n-(W-1)}, X_{n-(W-2)}, ..., X_n), composed of the M×K-dimensional data at times n-(W-1), n-(W-2), ..., n, is taken as the M×K×W-dimensional data at time n. If the original M×K-dimensional data X_n is observed at times n = 1, ..., N, the converted data Y_n is obtained for times n = W, ..., N.
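
The windowing in step S12 can be sketched as below, assuming each X_n is already a flat list of M×K numbers and time is 0-indexed. The function name `make_time_windows` and the sample data are hypothetical.

```python
def make_time_windows(series, W):
    """Convert N vectors of dimension M*K into N-W+1 vectors of dimension M*K*W.

    series[n] is the flattened M*K observation at time n (0-indexed here).
    The window for time n concatenates series[n-W+1], ..., series[n].
    """
    if W < 1 or W > len(series):
        raise ValueError("W must satisfy 1 <= W <= len(series)")
    windows = []
    for n in range(W - 1, len(series)):
        window = []
        for x in series[n - W + 1 : n + 1]:
            window.extend(x)
        windows.append(window)
    return windows

# Hypothetical data: N = 4 time points, M*K = 2 values per time point, W = 3.
X = [[1, 2], [3, 4], [5, 6], [7, 8]]
Y = make_time_windows(X, 3)
```

Here `Y` holds N - W + 1 = 2 windows, matching the statement that Y_n exists only for n = W, ..., N.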
Step S13: Next, the period setting unit 13 extracts the time-series data of a preset past period and a preset current period from the M×K×W (number of devices × number of items × time window length)-dimensional time-series data generated in step S12. Specifically, with the past period denoted [s1, e1] and the current period denoted [s2, e2], the past-period data {Y_{s1}, ..., Y_{e1}} and the current-period data {Y_{s2}, ..., Y_{e2}} are extracted from the M×K×W-dimensional data Y_n at times n = W, ..., N.
Step S14: Next, the clustering unit 14 classifies, by a clustering method, the states of the (e1 - s1 + e2 - s2 + 2) pieces of M×K×W (number of devices × number of items × time window length)-dimensional data that make up the time-series data extracted in step S13, namely the past period of length (e1 - s1 + 1) time points and the current period of length (e2 - s2 + 1) time points, and thereby obtains the cluster sequences corresponding to that time-series data. Specifically, with C_n denoting the cluster to which the M×K×W-dimensional data Y_n at time n belongs, the clustering unit 14 obtains the cluster sequence {C_{s1}, ..., C_{e1}} from the past-period data {Y_{s1}, ..., Y_{e1}} and the cluster sequence {C_{s2}, ..., C_{e2}} from the current-period data {Y_{s2}, ..., Y_{e2}}. Clustering here means grouping the (e1 - s1 + e2 - s2 + 2) pieces of M×K×W-dimensional data so that data that are close to one another, based on their mutual distances, fall into the same cluster. Arranging the clusters assigned to each piece of data in chronological order gives the cluster sequence. The clustering method may be hierarchical (for example, single linkage, complete linkage, group average, or Ward's method) or non-hierarchical (for example, K-Means).
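
A minimal sketch of step S14 using K-Means, one of the methods the text names, is shown below. It is not the patent's implementation: the pure-Python `kmeans` helper, the choice of the first k points as initial centroids, the fixed iteration count, and the sample data are all assumptions made to keep the example short and deterministic. The past and current data are clustered together and the label list is then split back into two cluster sequences.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-Means on a list of equal-length numeric vectors.

    Initial centroids are the first k points (a simplifying assumption
    that keeps the sketch deterministic). Returns (labels, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: mean of the members of each cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical windowed data: past period followed by current period.
past = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1]]
current = [[5.1, 5.0], [0.0, 0.0]]
labels, centroids = kmeans(past + current, k=2)
past_clusters = labels[: len(past)]      # cluster sequence for the past period
current_clusters = labels[len(past):]    # cluster sequence for the current period
```

The returned `centroids` are the kind of centroid coordinates that the score calculation device later takes as input.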
Step S15: Next, the cluster transition sequence creation unit 15 tracks, along the time axis, the clusters assigned in step S14 to the M×K×W (number of devices × number of items × time window length)-dimensional data at each time point of the past period [s1, e1] and the current period [s2, e2]. For each period it creates a sequence of cluster transitions between different clusters and attaches to each cluster in the sequence the stay duration in that cluster. Taking as an example the cluster sequence {C_{s1}, ..., C_{e1}} obtained from the past-period data {Y_{s1}, ..., Y_{e1}}: let τ_i (i = 1, 2, ..., I, with τ_1 = s1) be the time points in [s1, e1] at which a transition between different clusters occurs, and let c(τ_i) be the cluster after the transition at time τ_i. Arranging these in chronological order gives a cluster transition sequence of length I, c(τ_1) → c(τ_2) → ... → c(τ_I). Attaching to each cluster c(τ_i) in this sequence its stay duration d(τ_i) = τ_{i+1} - τ_i (with τ_{I+1} = e1) gives the cluster transition sequence with stay durations, c(τ_1)|d(τ_1) → c(τ_2)|d(τ_2) → ... → c(τ_I)|d(τ_I).
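
Step S15 amounts to run-length encoding of the cluster sequence, which can be sketched as follows. One assumption to note: this sketch counts the number of time points in each run as the stay duration, which for the final run can differ by one from the convention τ_{I+1} = e1 stated in the text. The function name `cluster_transition_sequence` and the labels are hypothetical.

```python
from itertools import groupby

def cluster_transition_sequence(cluster_labels):
    """Collapse a per-time-point cluster sequence into (cluster, stay) pairs.

    Consecutive repeats of the same cluster become one entry; `stay` is the
    number of time points spent in that run (an assumption of this sketch).
    """
    return [(c, sum(1 for _ in run)) for c, run in groupby(cluster_labels)]

labels = ["a", "a", "b", "b", "b", "a", "c", "c"]
sequence = cluster_transition_sequence(labels)
```

For the labels above, the sequence has length I = 4, since the cluster changes three times.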
Step S16: Next, the cluster transition tensor calculation unit 16 extracts cluster transitions of a preset fixed length L from the cluster transition sequence created in step S15 and calculates the appearance probability of each cluster transition pattern in the past period and the current period. For each period it then calculates a cluster transition tensor whose rank (number of dimensions) is the cluster transition length L, whose index along each dimension ranges over the unique values of all clusters that appeared in the past period and the current period, and whose values are the appearance probabilities of the cluster transition patterns. Taking as an example the cluster transition sequence of length I, c(τ_1) → c(τ_2) → ... → c(τ_I), obtained from the past-period data {Y_{s1}, ..., Y_{e1}}: (I - (L - 1)) cluster transitions of length L (where L ≤ I) can be extracted from this sequence, each written c(τ_{i-(L-1)}) → c(τ_{i-(L-2)}) → ... → c(τ_i) (i = L, ..., I). The cluster transition tensor calculation unit 16 groups these (I - (L - 1)) cluster transitions by pattern, calculates the appearance probability of each pattern, and from these calculates the L-dimensional cluster transition tensor. Here, the appearance probability of a cluster transition pattern is the appearance frequency of that pattern divided by the total appearance frequency of all cluster transition patterns.
The appearance frequency of a cluster transition pattern may instead be a value weighted by the stay duration of that pattern. The following describes how the appearance probabilities are stored in the tensor, using for simplicity an example in which L = 2 and the unique values of all clusters that appeared across the past period and the current period are α, β, and γ. In this case the cluster transition tensor is two-dimensional, and the index of each dimension takes the three values α, β, and γ. The tensor can be represented as a 3×3 array. If the appearance probability of the cluster transition pattern α → β is 0.1, the value 0.1 is stored in the array element whose first-axis index (the first element of the pattern) is α and whose second-axis index (the second element of the pattern) is β.
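
The tensor construction in step S16 can be sketched with a sparse representation: a dictionary keyed by L-tuples of clusters, where absent keys stand for probability 0. This is an illustrative choice rather than the source's data structure, stay-duration weighting is omitted, and the name `cluster_transition_tensor` and the sample sequence are hypothetical.

```python
from collections import Counter

def cluster_transition_tensor(transition_sequence, L):
    """Map each length-L cluster transition pattern to its appearance probability.

    transition_sequence is the list of clusters c(tau_1), ..., c(tau_I) with
    consecutive duplicates already removed. The tensor is stored sparsely as a
    dict keyed by L-tuples; patterns that never occur are implicitly 0.
    """
    I = len(transition_sequence)
    if not 1 <= L <= I:
        raise ValueError("L must satisfy 1 <= L <= I")
    # There are I - (L - 1) sliding windows of length L.
    counts = Counter(
        tuple(transition_sequence[i : i + L]) for i in range(I - L + 1)
    )
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}

# Hypothetical sequence over clusters alpha, beta, gamma with L = 2.
seq = ["alpha", "beta", "alpha", "beta", "gamma"]
tensor = cluster_transition_tensor(seq, 2)
```

A dense 3×3 array would hold the same values, with `tensor[("alpha", "beta")]` stored at row α, column β.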
Step S17: Next, based on the cluster transition tensors of the past period and the current period calculated in step S16, the change point score calculation unit 17 calculates the distance between the cluster transition tensor of the past period and that of the current period as the degree of change from the past period to the current period. If the elements of the past-period cluster transition tensor $D_1$ are $d^{1}_{i_1,\dots,i_L}$ and the elements of the current-period cluster transition tensor $D_2$ are $d^{2}_{i_1,\dots,i_L}$, the distance between the two can be expressed, for example, by the following root mean square error.
$$\left(\frac{1}{M^{L}}\sum_{i_1=1}^{M}\cdots\sum_{i_L=1}^{M}\left(d^{2}_{i_1,\dots,i_L}-d^{1}_{i_1,\dots,i_L}\right)^{2}\right)^{1/2}$$
Here, M in the inter-tensor distance above is the number of unique values of all clusters that appeared across the past and current periods.
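As a rough illustration of the step S17 distance, the sketch below computes a root mean square error between two cluster transition tensors. It assumes that the mean is taken over all $M^L$ tensor elements (M unique clusters, rank L), and it assumes a sparse dictionary representation keyed by pattern tuples; the function name `tensor_rmse` and the example values are hypothetical.

```python
import math

def tensor_rmse(d1, d2, num_clusters, L):
    """Root mean square error between two sparse cluster transition tensors.

    d1, d2: dicts mapping a pattern tuple to its appearance probability.
    num_clusters: M, the number of unique clusters across both periods.
    L: rank of the tensors (cluster transition length).
    """
    # Elements absent from both tensors contribute zero, so only the
    # patterns present in at least one tensor need to be visited.
    patterns = set(d1) | set(d2)
    squared = sum((d2.get(p, 0.0) - d1.get(p, 0.0)) ** 2 for p in patterns)
    # Assumption: the mean is taken over all M**L tensor elements.
    return math.sqrt(squared / num_clusters ** L)

past = {("a", "b"): 0.5, ("b", "a"): 0.5}
current = {("a", "b"): 0.5, ("b", "c"): 0.5}
score = tensor_rmse(past, current, num_clusters=3, L=2)
```

Identical tensors give a distance of 0, and the distance grows as probability mass moves between patterns.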
Step S18: Next, the detection unit 18 detects a change point when the change point score calculated in step S17 exceeds a preset threshold. That is, the detection unit 18 detects a change point when the change point score of the system state, calculated from the data (converted data) at each point in time, exceeds the preset threshold.
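The thresholding in step S18 amounts to a simple comparison. A minimal sketch, assuming the scores are held in a list indexed by time step (the function name and the values are hypothetical):

```python
def detect_change_points(scores, threshold):
    """Return the time indices whose change point score exceeds the threshold."""
    return [t for t, score in enumerate(scores) if score > threshold]

# Hypothetical change point scores for five consecutive time steps.
scores = [0.02, 0.03, 0.41, 0.05, 0.38]
change_points = detect_change_points(scores, threshold=0.3)
```

A score exactly equal to the threshold is not reported here, because the text says the score must exceed the threshold.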
Step S19: Finally, the output unit 19 outputs the change point detected in step S18.
[Hardware configuration]
Next, the hardware configuration of the change point detection device 10 will be described with reference to FIG. 3. FIG. 3 is a hardware configuration diagram of the change point detection device.
As shown in FIG. 3, the change point detection device 10 includes a processor 101, a memory 102, an auxiliary storage device 103, a connection device 104, a communication device 105, and a drive device 106. The hardware components of the change point detection device 10 are connected to one another via a bus 107.
The processor 101 serves as a control unit that controls the entire change point detection device 10, and includes various arithmetic devices such as a CPU (Central Processing Unit). The processor 101 reads various programs onto the memory 102 and executes them. The processor 101 may also include a GPGPU (general-purpose computing on graphics processing units).
The memory 102 includes main storage devices such as a ROM (Read Only Memory) and a RAM (Random Access Memory). The processor 101 and the memory 102 form a so-called computer, and the computer realizes various functions when the processor 101 executes the programs read onto the memory 102.
The auxiliary storage device 103 stores various programs and the information used when those programs are executed by the processor 101.
The connection device 104 is a device that connects external devices (for example, a display device 110 and an operation device 111) to the change point detection device 10.
The communication device 105 is a device for transmitting and receiving various information to and from other devices.
The drive device 106 is a device into which a recording medium 130 is set. The recording medium 130 here includes media that record information optically, electrically, or magnetically, such as a CD-ROM (Compact Disc Read-Only Memory), a flexible disk, or a magneto-optical disk. The recording medium 130 may also include a semiconductor memory that records information electrically, such as a ROM (Read Only Memory) or a flash memory.
The programs installed in the auxiliary storage device 103 are installed, for example, by setting a distributed recording medium 130 in the drive device 106 and having the drive device 106 read out the programs recorded on that recording medium 130. Alternatively, the programs may be installed by being downloaded from a network via the communication device 105.
[Main effects of the change point detection device]
As described above, the change point detection device 10 uses time-series data representing the system state at each point in time of a system (S) composed of one or more devices and, when some change occurs in the system state, can detect the time of its occurrence as a change point.
Moreover, because the change point detection device 10 is premised on classifying the system state at each point in time by a clustering technique, it can handle time-series data that includes data not satisfying the stationarity constraint or the i.i.d. constraint, such as data showing periodic fluctuation. Furthermore, the change point detection device 10 models the periodic fluctuation of the system (S) by taking into account the state transitions of the system (S) over time (that is, the transitions of the cluster to which the system state belongs at each point in time and of its stay period), and can therefore detect changes that include changes in temporal patterns, such as a change in the periodic fluctuation.
Note that the change point score calculation device 20 described later has the same hardware configuration as the change point detection device 10, so its description is omitted.
●Change point score calculation device according to the present embodiment
Next, an embodiment of the present invention will be described. This embodiment describes a change point score calculation device 20 that operates when the change point score calculation unit 17 of the change point detection device 10 calculates the distance between the cluster transition tensors of the past period and the current period. The device weights the squared error of the stay probability of each cluster transition pattern between the past and current periods by the distance of that pattern from the other patterns. This introduces a mechanism that raises the change point score when a cluster transition pattern far from those observed in the past is newly observed, or when a cluster transition pattern far from those currently observed was observed in the past, and thereby enables more precise change point detection.
[Functional configuration]
First, the functional configuration of the change point score calculation device 20 according to this embodiment will be described with reference to FIG. 3. FIG. 3 is a diagram showing an example of the functional configuration of the change point score calculation device according to this embodiment.
As shown in FIG. 3, the change point score calculation device 20 according to this embodiment includes a centroid coordinate input unit 21, a cluster transition pattern input unit 22, a cluster transition tensor input unit 23, an inter-centroid distance matrix calculation unit 24, an inter-cluster transition pattern distance matrix calculation unit 25, a change point score calculation unit 26, and an output unit 27.
The centroid coordinate input unit 21 inputs the centroid (cluster center) coordinates of all clusters generated by the clustering unit 14, that is, all clusters assigned to the data, of dimension number of devices × number of items × time window length, at each point in time constituting the time-series data of the past and current periods.
The cluster transition pattern input unit 22 inputs all cluster transition patterns extracted by the cluster transition tensor calculation unit 16 (all cluster transition patterns that appeared in the past and current periods).
The cluster transition tensor input unit 23 inputs the cluster transition tensors of the past period and the current period calculated by the cluster transition tensor calculation unit 16.
The inter-centroid distance matrix calculation unit 24 calculates the inter-centroid distance matrix for all cluster pairs based on the centroid coordinates of all clusters input by the centroid coordinate input unit 21, and produces it as an intermediate output.
The inter-cluster transition pattern distance matrix calculation unit 25 calculates the distance matrix for all cluster transition pattern pairs based on all cluster transition patterns input by the cluster transition pattern input unit 22 and the inter-centroid distance matrix for all cluster pairs calculated by the inter-centroid distance matrix calculation unit 24, and produces it as an intermediate output.
The change point score calculation unit 26 calculates the distance between the cluster transition tensors of the past and current periods, taking into account the distances between cluster transition patterns, based on the cluster transition tensors input by the cluster transition tensor input unit 23 and the distance matrix for all cluster transition pattern pairs calculated by the inter-cluster transition pattern distance matrix calculation unit 25, and produces it as an intermediate output. The change point score calculation unit 26 is a functional unit representing one aspect of the function of the change point score calculation unit 17.
The output unit 27 outputs the distance between the cluster transition tensors of the past and current periods calculated by the change point score calculation unit 26 as the change point score, and passes it to the detection unit 18.
[Change point score calculation process]
Next, the change point score calculation process (procedure) according to this embodiment will be described with reference to FIG. 4. FIG. 4 is a flowchart showing an example of the change point score calculation process according to this embodiment.
Step S21: First, the centroid coordinate input unit 21 inputs the centroid (cluster center) coordinates of all clusters generated by the clustering unit 14, that is, all clusters assigned to the data, of dimension number of devices × number of items × time window length, at each point in time constituting the time-series data of the past and current periods. Specifically, suppose the clustering unit 14 has generated I clusters $c_1, c_2, \dots, c_I$ from the K-dimensional (K = number of devices × number of items × time window length) time-series data constituting the time series of the past and current periods, and let $X_i = (X_{i1}, X_{i2}, \dots, X_{iK})$ be the centroid coordinates (a K-dimensional vector) of the i-th cluster $c_i$ (i = 1, 2, ..., I). The centroid coordinate input unit 21 then inputs the I centroid coordinates $X_1, X_2, \dots, X_I$.
Step S22: The cluster transition pattern input unit 22 inputs all cluster transition patterns extracted by the cluster transition tensor calculation unit 16 (all cluster transition patterns that appeared in the past and current periods). Specifically, suppose the cluster transition tensor calculation unit 16 has extracted M cluster transition patterns of fixed length L, and let $\{c_{m1}, c_{m2}, \dots, c_{mL}\}$ be the m-th cluster transition pattern (m = 1, 2, ..., M). The cluster transition pattern input unit 22 then inputs the M cluster transition patterns $\{c_{m1}, c_{m2}, \dots, c_{mL}\}$ (m = 1, 2, ..., M).
Step S23: The cluster transition tensor input unit 23 inputs the cluster transition tensors of the past period and the current period calculated by the cluster transition tensor calculation unit 16. Specifically, if the cluster transition tensors calculated for the past period and the current period are $D_1$ and $D_2$ respectively, the cluster transition tensor input unit 23 inputs these two tensors $D_1$ and $D_2$. Here, both $D_1$ and $D_2$ are tensors whose rank (number of dimensions) is the length L of the cluster transition patterns and whose index along each dimension ranges over the unique values of all clusters that appeared in the past and current periods. The element corresponding to a combination of indices (clusters) stores the stay probability of the cluster transition pattern obtained by arranging those indices (clusters) in dimension order.
Step S24: The inter-centroid distance matrix calculation unit 24 calculates the inter-centroid distance matrix for all cluster pairs based on the centroid coordinates of all clusters input by the centroid coordinate input unit 21. Specifically, based on the centroid coordinates $X_1, X_2, \dots, X_I$ of the I clusters $c_1, c_2, \dots, c_I$ input by the centroid coordinate input unit 21, it calculates the distance $d_c(X_i, X_j)$ between the centroid coordinate pair $\{X_i, X_j\}$ for every cluster pair $\{c_i, c_j\}$ (i = 1, 2, ..., I; j = 1, 2, ..., I), and obtains the I × I inter-centroid distance matrix $M_c$ whose (i, j) element stores this distance. As the distance $d_c(X_i, X_j)$ between the centroid coordinate pair $\{X_i, X_j\}$, the Euclidean distance given by the following equation may be used, or another distance (Manhattan distance, Chebyshev distance, Mahalanobis distance, etc.) may be used.
$$d_c(X_i, X_j)=\left(\sum_{k=1}^{K}\left(X_{jk}-X_{ik}\right)^{2}\right)^{1/2}$$
Because the distance $d_c(X_i, X_j)$ between the centroid coordinate pair $\{X_i, X_j\}$ does not change when cluster i and cluster j are swapped, $M_c$ is a symmetric matrix.
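The Euclidean variant of step S24 can be sketched in a few lines. The function name `centroid_distance_matrix` and the two-dimensional example centroids are hypothetical; `math.dist` (Python 3.8 or later) computes the Euclidean distance between two points.

```python
import math

def centroid_distance_matrix(centroids):
    """Return the I x I matrix of Euclidean distances between centroids."""
    return [[math.dist(xi, xj) for xj in centroids] for xi in centroids]

# Hypothetical centroids of three clusters in a K = 2 dimensional space.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
M_c = centroid_distance_matrix(centroids)
```

The result has a zero diagonal and is symmetric, which matches the remark that swapping clusters i and j does not change the distance.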
Step S25: The inter-cluster transition pattern distance matrix calculation unit 25 calculates the distance matrix for all cluster transition pattern pairs based on all cluster transition patterns input by the cluster transition pattern input unit 22 and the inter-centroid distance matrix for all cluster pairs calculated by the inter-centroid distance matrix calculation unit 24. Specifically, let $\pi_m = \{c_{m1}, c_{m2}, \dots, c_{mL}\}$ (m = 1, 2, ..., M) be the cluster transition patterns input by the cluster transition pattern input unit 22, and let $d_c(X_i, X_j)$ be the inter-centroid distance calculated by the inter-centroid distance matrix calculation unit 24 for every cluster pair $\{c_i, c_j\}$ (i = 1, 2, ..., I; j = 1, 2, ..., I). Based on these inputs, the inter-cluster transition pattern distance matrix calculation unit 25 calculates the distance $d_p(\pi_m, \pi_n)$ between cluster transition patterns for every cluster transition pattern pair $\{\pi_m, \pi_n\}$ (m = 1, 2, ..., M; n = 1, 2, ..., M), and obtains the M × M distance matrix $M_p$ whose (m, n) element stores this distance.
To calculate the distance between the cluster transition pattern $\pi_m = \{c_{m1}, c_{m2}, \dots, c_{mL}\}$ and the cluster transition pattern $\pi_n = \{c_{n1}, c_{n2}, \dots, c_{nL}\}$, an alignment technique from the field of bioinformatics is used, for example. An alignment technique can be used to judge the similarity between two, or three or more, sequences; it solves an optimization problem that seeks the optimal correspondence between the sequences while inserting gap symbols so that the sequence lengths become equal. Step S25 uses, in particular, a pairwise alignment technique, which finds the optimal correspondence between two sequences. What pairwise alignment optimizes (maximizes) is a score calculated from the correspondence between the two sequences: a score is set in advance for each of the cases where the aligned pair of characters is the same, is different, or has a gap on one side, and the overall score is the sum of these. In step S25, the pairwise alignment technique yields the optimal correspondence between two cluster transition patterns and the maximum score, so this maximum score is regarded as the similarity between the two patterns, and the distance between them is calculated from this similarity. For example, the distance can be obtained by normalizing the similarity to the interval [0, 1] and then subtracting it from 1. In general alignment techniques, constant scores are often set in advance for each of the three cases (same, different, gap). In step S25, however, the characters that make up a sequence (cluster transition pattern) are clusters, so the inter-centroid distance calculated in step S24 is used as the score for a pair of different characters (cluster pair). For example, if the score for a pair of identical characters (cluster pair) is +s and the score when one side is a gap is -g, the inter-centroid distance matrix $M_c$ is normalized so that its minimum value becomes +s and its maximum value becomes -g. The minimum value of $M_c$ corresponds to the distance between identical clusters, so it is mapped to the score +s set for a pair of identical characters (cluster pair); the maximum value of $M_c$ corresponds to the distance between the two most distant clusters, so it is mapped to the score -g for the case where one side is a gap.
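One way to realize the pairwise alignment of step S25 is a Needleman-Wunsch style global alignment. The sketch below is an assumption-laden illustration rather than the method of the disclosure: the linear rescaling of $M_c$, the normalization of the maximum score over the range [-gL, +sL] for two patterns of equal length L, and all function names are choices made here for the example. Clusters are represented by their integer indices into $M_c$.

```python
def mismatch_scores(M_c, s=1.0, g=1.0):
    """Rescale centroid distances linearly so that the minimum maps to +s
    and the maximum maps to -g (assumed linear mapping)."""
    flat = [d for row in M_c for d in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # guard against all distances being equal
    return [[s - (s + g) * (d - lo) / span for d in row] for row in M_c]

def alignment_score(p, q, score, g=1.0):
    """Maximum global alignment score of two cluster-index sequences."""
    rows, cols = len(p) + 1, len(q) + 1
    F = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        F[i][0] = -g * i  # prefix of p aligned against gaps only
    for j in range(1, cols):
        F[0][j] = -g * j  # prefix of q aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            F[i][j] = max(
                F[i - 1][j - 1] + score[p[i - 1]][q[j - 1]],  # align two clusters
                F[i - 1][j] - g,  # gap inserted in q
                F[i][j - 1] - g,  # gap inserted in p
            )
    return F[-1][-1]

def pattern_distance(p, q, M_c, s=1.0, g=1.0):
    """Distance in [0, 1]: one minus the normalized maximum alignment score.

    Assumes both patterns have the same length L, so the score lies
    between -g * L and +s * L.
    """
    L = len(p)
    raw = alignment_score(p, q, mismatch_scores(M_c, s, g), g)
    similarity = (raw + g * L) / ((s + g) * L)
    return 1.0 - similarity

# Hypothetical inter-centroid distances for three clusters indexed 0, 1, 2.
M_c = [[0.0, 5.0, 10.0], [5.0, 0.0, 5.0], [10.0, 5.0, 0.0]]
d_same = pattern_distance((0, 1), (0, 1), M_c)
d_near = pattern_distance((0, 1), (0, 2), M_c)
d_far = pattern_distance((0, 0), (2, 2), M_c)
```

Identical patterns get distance 0, patterns that differ only by a nearby cluster get a small distance, and patterns built from the most distant clusters get distance 1. Other normalizations of the similarity would be equally compatible with the text.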
Step S26: The change point score calculation unit 26 calculates the distance between the cluster transition tensors of the past and current periods, taking into account the distances between cluster transition patterns, based on the cluster transition tensors of the past and current periods input by the cluster transition tensor input unit 23 and the distance matrix for all cluster transition pattern pairs calculated by the inter-cluster transition pattern distance matrix calculation unit 25. Specifically, let $p^{1}_{m}$ and $p^{2}_{m}$ be the stay probabilities of the cluster transition pattern $\pi_m = \{c_{m1}, c_{m2}, \dots, c_{mL}\}$ stored in the past-period and current-period cluster transition tensors $D_1$ and $D_2$ input by the cluster transition tensor input unit 23, and let $\delta_m$ be the m-th element of the distance vector obtained by taking the row means of the distance matrix $M_p$ for all cluster transition pattern pairs calculated by the inter-cluster transition pattern distance matrix calculation unit 25. The distance $d(D_1, D_2)$ between the cluster transition tensors of the past and current periods, taking into account the distances between cluster transition patterns, is then calculated, for example, by the following equation.
$$d(D_1, D_2)=\left(\sum_{m}\delta_m\left(p^{2}_{m}-p^{1}_{m}\right)^{2}\right)^{1/2}$$
Here, the m-th element $\delta_m$ of the distance vector is the average distance of the cluster transition pattern $\pi_m$ from the other patterns, and is calculated by the following equation, where N is the number of columns of the distance matrix $M_p$, that is, the number of cluster transition patterns that appeared across the past and current periods.
$$\delta_m=\sum_{n\neq m} d_p(\pi_m, \pi_n)/(N-1)$$
Step S27: The output unit 27 outputs the distance $d(D_1, D_2)$ between the cluster transition tensors of the past and current periods calculated by the change point score calculation unit 26 as the change point score, and passes it to the detection unit 18.
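The weighted distance of step S26 can be sketched as follows, assuming the stay probabilities of the N patterns are given as two lists in the same order as the rows of the pattern distance matrix. The function names and the numerical values are hypothetical.

```python
import math

def row_mean_distances(M_p):
    """delta_m: mean distance of pattern m from the other N - 1 patterns.

    Requires at least two patterns (N >= 2).
    """
    N = len(M_p)
    return [sum(M_p[m][n] for n in range(N) if n != m) / (N - 1) for m in range(N)]

def weighted_tensor_distance(p1, p2, M_p):
    """Square root of the delta-weighted sum of squared probability differences."""
    delta = row_mean_distances(M_p)
    return math.sqrt(sum(d * (b - a) ** 2 for d, a, b in zip(delta, p1, p2)))

# Hypothetical pattern distance matrix and stay probabilities for N = 3 patterns.
M_p = [[0.0, 0.25, 1.0], [0.25, 0.0, 0.75], [1.0, 0.75, 0.0]]
p_past = [0.5, 0.5, 0.0]
p_current = [0.5, 0.0, 0.5]
score = weighted_tensor_distance(p_past, p_current, M_p)
```

In this example the third pattern is the most isolated one (largest row mean), so its newly observed probability mass contributes more to the score than the mass lost by the second pattern, which is the effect the weighting is meant to produce.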
[Main effects of the embodiment]
As described above, when the change point score calculation unit 17 calculates the distance between the cluster transition tensors of the past and current periods, the change point score calculation device 20 according to this embodiment weights the squared error of the stay probability of each cluster transition pattern between the past and current periods by the distance of that pattern from the other patterns. This introduces a mechanism that raises the change point score when a cluster transition pattern far from those observed in the past is newly observed, or when a cluster transition pattern far from those currently observed was observed in the past, so that more precise change point detection can be realized.
[Supplement]
The present invention is not limited to the above-described embodiment, and may have configurations or processes (operations) such as the following.
The change point detection device 10 and the change point score calculation device 20 can also be realized by a computer and a program, and this program can be recorded on a (non-transitory) recording medium or provided through a network such as the Internet.
10 Change point detection device
11 Input unit
12 Time window generation unit
13 Period setting unit
14 Clustering unit
15 Cluster transition sequence creation unit
16 Cluster transition tensor calculation unit
17 Change point score calculation unit
18 Detection unit
19 Output unit
20 Change point score calculation device
21 Centroid coordinate input unit
22 Cluster transition pattern input unit
23 Cluster transition tensor input unit
24 Inter-centroid distance matrix calculation unit
25 Inter-cluster transition pattern distance matrix calculation unit
26 Change point score calculation unit
27 Output unit

Claims (4)

  1.  A change point score calculation device comprising:
    a centroid coordinate input unit that inputs centroid coordinates, each being the center of one of all clusters generated by a clustering unit and assigned to data, of dimension number of devices × number of items × time window length, at each point in time constituting time-series data of a past period and a current period;
    a cluster transition pattern input unit that inputs all cluster transition patterns that appeared in the past period and the current period, as extracted by a cluster transition tensor calculation unit;
    a cluster transition tensor input unit that inputs cluster transition tensors of the past period and the current period, as calculated by the cluster transition tensor calculation unit;
    an inter-centroid distance matrix calculation unit that calculates an inter-centroid distance matrix for all cluster pairs based on the centroid coordinates of all clusters input by the centroid coordinate input unit;
    an inter-cluster transition pattern distance matrix calculation unit that calculates a distance matrix for all pairs of the cluster transition patterns based on all the cluster transition patterns input by the cluster transition pattern input unit and the inter-centroid distance matrix for all the cluster pairs calculated by the inter-centroid distance matrix calculation unit; and
    a change point score calculation unit that calculates a distance between the cluster transition tensors of the past period and the current period, taking into account the distances between the cluster transition patterns, based on the cluster transition tensors of the past period and the current period input by the cluster transition tensor input unit and the distance matrix for all pairs of the cluster transition patterns calculated by the inter-cluster transition pattern distance matrix calculation unit.
  2.  The change point score calculation device according to claim 1, wherein the inter-cluster transition pattern distance matrix calculation unit uses an alignment technique when calculating the distance between the cluster transition patterns, regards as a similarity a value obtained by normalizing the score that achieves the optimal correspondence between the cluster transition patterns calculated by the alignment technique, and calculates the distance between the cluster transition patterns based on the similarity.
  3.  A change point score calculation method in which a computer executes:
    a centroid coordinate input process of inputting centroid coordinates, generated by a clustering unit, which are the centers of all clusters assigned to the data of dimension (number of devices) × (number of items) × (time window length) at each time point constituting the time-series data of a past period and a current period;
    a cluster transition pattern input process of inputting all cluster transition patterns that appeared in the past period and the current period, extracted by a cluster transition tensor calculation unit;
    a cluster transition tensor input process of inputting cluster transition tensors for each of the past period and the current period, calculated by the cluster transition tensor calculation unit;
    an inter-centroid distance matrix calculation process of calculating an inter-centroid distance matrix for all cluster pairs based on the centroid coordinates of all clusters input by the centroid coordinate input process;
    a cluster transition pattern distance matrix calculation process of calculating a distance matrix for all pairs of the cluster transition patterns, based on all the cluster transition patterns input by the cluster transition pattern input process and the inter-centroid distance matrix for all the cluster pairs calculated by the inter-centroid distance matrix calculation process; and
    a change point score calculation process of calculating a distance between the cluster transition tensors of the past period and the current period, taking into account the distance between the cluster transition patterns, based on the cluster transition tensors for each of the past period and the current period input by the cluster transition tensor input process and the distance matrix for all pairs of the cluster transition patterns calculated by the cluster transition pattern distance matrix calculation process.
  4.  A program that causes a computer to execute the method according to claim 3.
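
The three calculation steps recited in the claims (inter-centroid distance matrix, distance matrix between cluster transition patterns, change point score) can be illustrated with the following minimal Python sketch. It is not the applicant's implementation, and several choices are assumptions made only for illustration: Euclidean distance between centroids, Needleman-Wunsch global alignment as the alignment technique, a substitution score of 1 - 2·d/d_max, a gap penalty of -1, equal-length patterns, a cluster transition tensor represented as a mapping from pattern to occurrence count, and a quadratic-form distance as the score. All function names are hypothetical.

```python
import math


def centroid_distance_matrix(centroids):
    """Euclidean distance between every pair of cluster centroids (assumed metric)."""
    return [[math.dist(a, b) for b in centroids] for a in centroids]


def pattern_distance(p, q, cdist, gap=-1.0):
    """Distance between two equal-length cluster transition patterns.

    Assumption: Needleman-Wunsch global alignment, where aligning clusters
    i and j scores 1 - 2 * d_ij / d_max (between -1 and 1) and a gap
    costs `gap`. The optimal score is normalized to a similarity in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("this sketch assumes equal-length patterns")
    n = len(p)
    d_max = max(max(row) for row in cdist) or 1.0

    def sub(i, j):
        return 1.0 - 2.0 * cdist[i][j] / d_max

    # Dynamic programming table of optimal alignment scores for prefixes.
    score = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(1, n + 1):
        score[k][0] = score[0][k] = k * gap
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(p[i - 1], q[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # The optimal score lies in [-n, n]; map it linearly onto [0, 1].
    similarity = (score[n][n] + n) / (2.0 * n)
    return 1.0 - similarity


def pattern_distance_matrix(patterns, cdist):
    """Distance matrix over all pairs of cluster transition patterns."""
    return [[pattern_distance(p, q, cdist) for q in patterns] for p in patterns]


def change_point_score(past, current, patterns, pdist):
    """Quadratic-form distance between two pattern-count mappings (assumed score).

    `past` and `current` map each pattern to its occurrence count. The
    similarity matrix 1 - pdist is not guaranteed to be positive
    semidefinite, so the quadratic form is clamped at zero.
    """
    def normalize(counts):
        total = sum(counts.get(p, 0) for p in patterns) or 1
        return [counts.get(p, 0) / total for p in patterns]

    h, g = normalize(past), normalize(current)
    v = [a - b for a, b in zip(h, g)]
    m = len(v)
    quad = sum(v[i] * (1.0 - pdist[i][j]) * v[j] for i in range(m) for j in range(m))
    return math.sqrt(max(quad, 0.0))
```

Under these assumptions, a shift of occurrences toward a pattern whose clusters lie close to the original ones yields a smaller score than a shift toward a pattern involving a distant cluster, which is the behaviour that "taking into account the distance between the cluster transition patterns" is meant to capture.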
PCT/JP2022/018871 2022-04-26 2022-04-26 Change point score calculation device, change point score calculation method, and program WO2023209799A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2022/018871 WO2023209799A1 (en) 2022-04-26 2022-04-26 Change point score calculation device, change point score calculation method, and program

Publications (1)

Publication Number Publication Date
WO2023209799A1 (en) 2023-11-02

Family

ID=88518210

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/018871 WO2023209799A1 (en) 2022-04-26 2022-04-26 Change point score calculation device, change point score calculation method, and program

Country Status (1)

Country Link
WO (1) WO2023209799A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017068748A (en) * 2015-10-01 2017-04-06 富士通株式会社 Clustering program, clustering method, and information processing apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHOKO TAKAHASHI, KEI TAKESHITA: "Proposal of change detection technology using cluster transition tensor", IEICE TECHNICAL REPORT, CQ, IEICE, JP, vol. 121, no. 263 (CQ2021-75), 18 November 2021 (2021-11-18), JP, pages 49 - 54, XP009549874 *

Similar Documents

Publication Publication Date Title
Mao et al. Principal graph and structure learning based on reversed graph embedding
Imani et al. A binary learning framework for hyperdimensional computing
Chen et al. Recursive projection twin support vector machine via within-class variance minimization
Fürnkranz et al. Multilabel classification via calibrated label ranking
Ng et al. Hashing-based undersampling ensemble for imbalanced pattern classification problems
Macêdo et al. Entropic out-of-distribution detection: Seamless detection of unknown examples
Tsimperidis et al. R2BN: An adaptive model for keystroke-dynamics-based educational level classification
Miller et al. Critic-driven ensemble classification
Gu et al. Fuzzy style k-plane clustering
Dialameh et al. A general feature-weighting function for classification problems
Joshi et al. Multimodal biometrics: state of the art in fusion techniques
Xia et al. An efficient and accurate rough set for feature selection, classification, and knowledge representation
CN116959725A (en) Disease risk prediction method based on multi-mode data fusion
Nguyen et al. Incomplete label multiple instance multiple label learning
Han et al. Generalizing long short-term memory network for deep learning from generic data
Lu et al. Discriminative transfer learning using similarities and dissimilarities
Xu et al. Trusted-data-guided label enhancement on noisy labels
Liu et al. Toe: A grid-tagging discontinuous ner model enhanced by embedding tag/word relations and more fine-grained tags
CN111563539A (en) Domain self-adaption method based on Hilbert-Schmidt independent criterion subspace learning
Wang et al. Solution path for manifold regularized semisupervised classification
Hu Research on English achievement analysis based on improved CARMA algorithm
Ren et al. A diversified attention model for interpretable multiple clusterings
WO2023209799A1 (en) Change point score calculation device, change point score calculation method, and program
Liu et al. A weight-incorporated similarity-based clustering ensemble method
WO2023084787A1 (en) Change point detection device, change point detection method, and program

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22940079

Country of ref document: EP

Kind code of ref document: A1