JPH07219929A - Method for detecting deviated value and data processor

Info

Publication number: JPH07219929A
Application number: JP6008871A
Authority: JP
Inventors: Taichiro Ueda; 太一郎上田
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1994-01-28
Filing date: 1994-01-28
Publication date: 1995-08-18

Abstract

PURPOSE: To find outliers in acquired data (yield, reaction amount). CONSTITUTION: Data are entered in an input step and sorted in ascending order in a magnitude determination step. A calculation step then computes the detection statistic for the data with the smallest value removed, and computes it in the same way for each of the other candidate combinations. The detection statistic is also computed for the full data set with nothing removed. An outlier detection step finds the combination of data that minimizes the detection statistic and treats the removed values as outliers. If the detection statistic with no data removed is the minimum, there is no outlier.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a method for detecting outliers in data arising in production processes, quality control, research and development, quality improvement, and the like, and to an apparatus using that method.

[0002]

[Prior Art] For example, when measuring variation in product performance, when reading electric power meters or water meters, or when collecting experimental data, the data obtained may include values showing out-of-specification performance or abnormal measurements. Such out-of-specification data and abnormal values are often inappropriate data arising from the measurement environment or the measuring equipment itself. These inappropriate data are hereinafter called outliers. Because an outlier is not a value that should have been measured, methods of detecting outliers in such data and removing them have long been studied. Conventional methods for detecting outliers are based on statistical techniques. An outlier is a data point that takes an extremely large or extremely small value. For example, consider 5.71, 6.57, 7.29, 8.06, 10.00, and 15.00. Plotted, they appear as in FIG. 12, where 15.00 looks like an outlier. In the statistical approach, if one outlier is assumed, a formula for testing a single outlier is used; if two are assumed, a formula for two is used. Because this is a statistical test, a risk rate (significance level) must be fixed in advance; traditionally 5% or 1% is used. A risk rate of 5% means that the probability of error when a value is judged an outlier by the test is 5%. A 5% or 1% table of critical values corresponds to each formula; the value computed from the actual data is compared with the table, and if it is larger the value is judged an outlier, subject to a risk rate of 5% or 1% as the case may be.

[0003] As described above, conventional statistical methods have the problems that the calculation method differs depending on whether there are one, two, or three outliers, and that the conclusion differs depending on the risk rate (also called the significance level) chosen, such as 5% or 1% (for example, a value judged an outlier at 5% may not be an outlier at 1%). The calculation method also differs between large-value and small-value outliers. Because these are statistical methods, 5% and 1% tables are also required.

[0004] A specific description follows with reference to FIGS. 13 and 14. In FIG. 13, 1 is an information processing apparatus, 2 is a computer (with FDD), 3 is a display unit, 4 is a printer, 5 is a keyboard, and 6 is a floppy disk. The floppy disk 6 storing the program routine is inserted into the computer (with FDD) 2, and the operating software is run to start the information processing apparatus 1. The program routine is loaded from the floppy disk 6, and the apparatus waits for input.

[0005] FIG. 14 is a flowchart illustrating the conventional example. In step 1, data are entered successively from the keyboard 5. In step 2, the number of outliers is entered; if it is 1, steps 3 and 4 compute the statistics taking the smallest value and the largest value, respectively, as the outlier. They are computed separately because the calculation method differs for small and large values. If the number of outliers is 2, steps 5 to 7 compute the statistic by a separate method for each case: two small values as outliers, one small and one large value as outliers, and two large values as outliers.

[0006] In step 8, the table is consulted and the value calculated in the preceding steps is compared with the critical value (significance point) in the table. In step 9, if the calculated value is larger than the critical value, the value is recognized as an outlier. In step 10, if the calculated value is smaller, it is not treated as an outlier. In step 11, the result is displayed or otherwise output. FIG. 14 shows the case of a 5% risk rate (significance level); for 1%, a 1% table is required.

[0007] An example of detecting an outlier by the conventional statistical method follows. The data are 5.71, 6.57, 7.29, 8.06, 10.00, and 15.00. The Grubbs test statistic of Equation 1 is used.

[0008]

[Equation 1]

[0009] Because 15.00 is suspected to be an outlier, the maximum value max Ti (i = 1, 2, ..., n) of Ti is computed and compared with the tabulated value. Here max Ti = 1.841. According to the Grubbs table, Table 1, the critical value for sample size n = 6 is 1.82 at a 5% risk rate and 1.94 at a 1% risk rate. Hence 1.82 < max Ti < 1.94. Therefore 15.00 is an outlier at the 5% risk rate but cannot be called an outlier at the 1% risk rate. The conclusion thus depends on the risk rate.
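The arithmetic of this conventional example can be checked with a short sketch. The body of Equation 1 is not reproduced in this text, so the code assumes the usual Grubbs form Ti = |xi - mean| / s with the sample standard deviation s (divisor n - 1); that assumption reproduces the quoted max Ti = 1.841. The function name `max_grubbs_t` is illustrative, and the critical values are the ones quoted from Table 1.

```python
import math

def max_grubbs_t(values):
    """Return max_i |x_i - mean| / s, with s the sample standard deviation.

    Assumed form of Equation 1 (the equation body is not in the text).
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return max(abs(x - mean) for x in values) / s

data = [5.71, 6.57, 7.29, 8.06, 10.00, 15.00]
t_max = max_grubbs_t(data)

# Critical values for n = 6 as quoted in the text from Table 1.
CRIT_5_PERCENT = 1.82
CRIT_1_PERCENT = 1.94

print(f"{t_max:.3f}")            # 1.841
print(t_max > CRIT_5_PERCENT)    # True: outlier at the 5% risk rate
print(t_max > CRIT_1_PERCENT)    # False: not an outlier at the 1% risk rate
```

The same data therefore yield opposite conclusions at the two risk rates, which is the difficulty the paragraph describes.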

[0010]

[Table 1]

[0011] (Quoted in part from Vic Barnett and Toby Lewis (1978), "Outliers in Statistical Data", John Wiley & Sons, p. 298.)

[0012] Next, an example of the masking effect is given. The data are 5.71, 6.57, 7.29, 8.06, 14.80, and 15.00. For these data max Ti = 1.29, so max Ti < 1.82 and no outlier is found. This is called the masking effect: when there are two outlier candidates, here 14.80 and 15.00, the conventional method does not necessarily detect an outlier if it tests for a single outlier.
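The masking effect in this example can be reproduced numerically. The sketch below assumes Ti = |xi - mean| / s with the sample standard deviation (divisor n - 1), since Equation 1 itself is not reproduced in this text; `max_grubbs_t` is an illustrative name.

```python
import math

def max_grubbs_t(values):
    """Return max_i |x_i - mean| / s (assumed form of the Grubbs statistic)."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return max(abs(x - mean) for x in values) / s

masked = [5.71, 6.57, 7.29, 8.06, 14.80, 15.00]
t_masked = max_grubbs_t(masked)

# The second large value raises both the mean and the standard deviation,
# so the statistic stays below the 5% critical value of 1.82 quoted for n = 6.
print(f"{t_masked:.2f}")   # 1.29
print(t_masked < 1.82)     # True: the single-outlier test reports nothing
```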

[0013]

[Problems to Be Solved by the Invention] As described above, in the conventional approach the calculation method differs depending on the number of outliers. It also differs depending on the nature of the outlier, that is, whether it lies at the large end or the small end (Vic Barnett and Toby Lewis (1978), "Outliers in Statistical Data", John Wiley & Sons, lists more than 40 formulas). The calculated value must also be compared with a table. The conclusion differs depending on the risk rate (5%, 1%, and so on). Finally, because of the masking effect, when there are for example two outlier candidates, the conventional method does not necessarily detect an outlier if it tests for a single outlier.

[0014] The present invention was made to solve these problems. Its object is to provide an outlier detection method that uses no table of critical values and does not require the calculation method to be changed according to the number or nature of the outliers. A further object is to provide an outlier detection method that avoids the masking effect. A further object is to provide an outlier detection method whose calculation is as simple as possible and requires little computation. A still further object is to provide a data processing apparatus that uses these outlier detection methods.

[0015]

[Means for Solving the Problems] The outlier detection method according to the present invention comprises the following steps: (a) an input step of inputting N values (N ≥ 3); (b) a magnitude determination step of determining the magnitude relationship among the N values input in the input step; (c) a calculation step of determining, on the basis of the magnitude relationship determined in the magnitude determination step, the combination of all N values and combinations of fewer than N values from which outlier candidates have been removed, and calculating a detection statistic for each combination using a predetermined formula; and (d) an outlier detection step of detecting outliers on the basis of the detection statistics calculated in the calculation step.

[0016] The calculation step is characterized in that, when up to s outliers are to be detected, it forms a plurality of combinations each consisting of n (n = N - s) or more values that are consecutive in the order determined by the magnitude determination step, and calculates the detection statistics using these combinations.

[0017] The outlier detection step is characterized by comprising a minimum selection step of selecting the smallest of the detection statistics obtained from the combinations of fewer than N values, and an outlier determination step of, when the selected minimum is smaller than the detection statistic obtained from the combination of all N values, taking as outliers the values not included in the combination from which the selected minimum was calculated.

[0018] The formula has a first term that tends to decrease when outlier candidates are removed and a second term that increases when outlier candidates are removed, and the calculation step is characterized by calculating the values of the first and second terms and obtaining the detection statistic as their sum.

[0019] The formula further has, in addition to the first and second terms, a correction term that corrects them, and the calculation step is characterized by calculating the values of the first, second, and third terms and obtaining the detection statistic as the sum of the three.

[0020] The first term is characterized by using the variance of the fewer than N values for which the detection statistic is calculated.

[0021] The second term is characterized by using the number of outlier candidates assumed when the detection statistic is calculated.

[0022] The second term is characterized by using the number of outlier candidates multiplied by a predetermined coefficient.

[0023] The formula has the variance of the fewer than N values for which the detection statistic is calculated and a coefficient applied to the variance, and the calculation step is characterized by obtaining the detection statistic by multiplying the variance by the coefficient.

[0024] The formula is characterized by being constructed on the basis of a variable selection criterion of regression analysis.

[0025] The outlier detection method is further characterized by comprising, between the input step and the magnitude determination step, a processing step of processing the input values.

[0026] The processing step is characterized by converting time-dependent values input in the input step into time-independent values.

[0027] The processing step is characterized by calculating leverage from the values input in the input step.

[0028] The processing step is characterized by calculating data for a regression analysis model from the values input in the input step.

[0029] The processing step is characterized by calculating data for a canonical correlation analysis model from the values input in the input step.

[0030] The processing step is characterized by calculating the discriminant function value of each group when the values input in the input step are classified into a plurality of groups and discriminant analysis is performed on a plurality of factors.

[0031] The data processing apparatus according to the present invention comprises outlier detection means for detecting outliers by executing the outlier detection method, measuring means for measuring N values and inputting them to the outlier detection means, and output means for reporting the outliers detected by the outlier detection means.

[0032] The data processing apparatus is further characterized by comprising data processing means for executing predetermined processing using the values that remain after the outliers detected by the outlier detection means are removed.

[0033]

[Operation] In the first invention, when N values are input in the input step, the magnitude determination step determines their magnitude relationship and takes some of the largest or smallest values as outlier candidates. The calculation step first determines the combination of all N values and the combinations of fewer than N values from which outlier candidates have been removed, and then calculates a detection statistic for each of these combinations using a predetermined formula. The outlier detection step detects outliers on the basis of the calculated detection statistics.

[0034] The calculation step in the second invention calculates the detection statistic by a predetermined formula using combinations of n (n = N - s) or more consecutive values, on the basis of the ordering determined in the magnitude determination step. For example, when five values (N = 5) are input in the input step and up to two outliers (s = 2) are to be detected, one combination is formed from the three largest input values and another from the four largest. A further combination is formed from the three middle values, excluding the maximum and the minimum. Combinations are likewise formed from the three smallest and from the four smallest input values.
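The enumeration described in this paragraph can be sketched as follows. The function name `contiguous_combinations` and the (removed, kept) return shape are illustrative choices, not taken from the text; the sketch also yields the full set of N values, which the calculation step evaluates as well.

```python
def contiguous_combinations(values, s_max):
    """Yield (removed, kept) pairs for every way of dropping at most s_max
    values from the two ends of the sorted data."""
    ordered = sorted(values)
    total = len(ordered)
    for low in range(s_max + 1):              # values dropped from the small end
        for high in range(s_max + 1 - low):   # values dropped from the large end
            kept = ordered[low:total - high]
            removed = ordered[:low] + ordered[total - high:]
            yield removed, kept

# N = 5, s = 2: the full set plus the five subsets named in the paragraph.
combos = list(contiguous_combinations([1, 2, 3, 4, 5], s_max=2))
for removed, kept in combos:
    print(removed, kept)
```

Because every kept subset is a contiguous run of the sorted data, the number of combinations grows only quadratically with s rather than combinatorially with N.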

[0035] The outlier detection step in the third invention first selects the smallest of the detection statistics calculated in the calculation step for the combinations of fewer than N values. It then compares the selected minimum with the detection statistic obtained from the combination of all N values; if the minimum is smaller, the values not included in the combination from which the minimum was calculated are taken as outliers. If the minimum is larger, it is determined that there is no outlier.
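The decision rule of this paragraph can be sketched on its own, given detection statistics that have already been calculated. The function name `decide_outliers` and the dictionary layout are illustrative; the numbers are the ones quoted in Embodiment 1 for the data 5.71, 6.57, 7.29, 8.06, 13.32. The text does not say what happens on an exact tie, so treating a tie as "no outlier" is an assumption of this sketch.

```python
def decide_outliers(stat_full, stats_by_removed):
    """stats_by_removed maps a tuple of removed values to the detection
    statistic of the remaining values; stat_full is the statistic of all N."""
    removed, smallest = min(stats_by_removed.items(), key=lambda item: item[1])
    if smallest < stat_full:
        return list(removed)   # values absent from the minimizing combination
    return []                  # full data set is at least as good: no outlier

# Statistics quoted in Embodiment 1.
stats = {
    (5.71,): 5.908,
    (13.32,): 1.440,
    (5.71, 13.32): 2.509,
}
result = decide_outliers(4.930, stats)
print(result)   # [13.32]
```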

[0036] The formula in the fourth invention has a first term that tends to decrease when outlier candidates are removed and a second term that increases when outlier candidates are removed, and the detection statistic is obtained as their sum. With this formula, when outliers are present, the detection statistic is minimized when the most deviant values are removed.

[0037] The formula in the fifth invention has a third term in addition to the first and second terms. The third term is a correction term that corrects the first and second terms. The detection statistic is obtained by adding the first, second, and third terms.

[0038] In the formula of the sixth invention, the first term includes the variance of the fewer than N values for which the detection statistic is calculated. Accordingly, when the most deviant value is removed, the variance decreases and the value of the first term decreases.

[0039] In the formula of the seventh invention, the second term includes the number of outlier candidates assumed when the detection statistic is calculated. Accordingly, the value of the second term increases as more outliers are sought.

[0040] The formula in the eighth invention uses, as the second term, the number of outlier candidates multiplied by a predetermined coefficient.

[0041] The formula in the ninth invention obtains the detection statistic by multiplying the variance of the fewer than N values for which the detection statistic is calculated by a coefficient.

[0042] In the tenth invention, the formula for the detection statistic is constructed on the basis of a variable selection criterion of regression analysis.

[0043] In the eleventh invention, the processing step can convert the values input in the input step into data from which the detection statistic can be calculated, so that various kinds of data can be input.

[0044] In the twelfth invention, the values input in the input step are corrected for a tendency to, for example, increase or decrease in proportion to time, and are thereby converted into time-independent values. The detection statistic is then calculated from the corrected values, and outliers can be determined.
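One way to picture this processing step is to fit a straight line to the values against time and keep the residuals. The text does not specify how the correction is done, so the ordinary least-squares line, the names `detrend`, `times`, and `values`, and the sample readings are all assumptions made for illustration.

```python
def detrend(times, values):
    """Remove a least-squares linear trend and return the residuals."""
    n = len(values)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in zip(times, values)]

# Hypothetical readings that rise with time, with one abnormal value at t = 3.
times = [0, 1, 2, 3, 4, 5]
values = [1.0, 3.1, 4.9, 12.0, 9.0, 11.1]
residuals = detrend(times, values)
print([round(r, 2) for r in residuals])
```

The residuals no longer carry the upward trend, so they can be sorted and passed to the detection-statistic calculation like any other set of N values.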

[0045] In the thirteenth invention, when each sample has a plurality of characteristic values, the diagonal elements of the leverage are calculated, and the detection statistic is obtained from the calculated values to determine outliers.

[0046] In the fourteenth invention, when regression analysis can be applied to the input values, the regression residuals are calculated, and the detection statistic is obtained from the calculated values to determine outliers.

[0047] In the fifteenth invention, when the input values are data for a canonical correlation analysis model, canonical correlation analysis is performed to obtain two composite variate functions, the composite variate function values are obtained from them, leverage is calculated from those values, and the detection statistic is obtained from the calculated leverage to determine outliers.

[0048] In the sixteenth invention, when the input values are classified into a plurality of groups and discriminant analysis is performed on a plurality of factors, the discriminant function values are calculated, and the detection statistic is obtained from the calculated values to determine outliers.

[0049] In the data processing apparatus of the seventeenth invention, the measuring means measures N values, the outlier detection means executing the above outlier detection method detects outliers among the measured values, and the output means reports the outliers.

[0050] In the data processing apparatus of the eighteenth invention, the measuring means measures N values, the outlier detection means executing the above outlier detection method detects outliers among the measured values, and the data processing means executes predetermined processing using the values that remain after the detected outliers are removed.

[0051]

[Embodiments]

Embodiment 1. FIG. 13, described for the conventional example, is used again to describe the apparatus of this embodiment. In FIG. 13, 1 is an information processing apparatus, 2 is a computer (with FDD), 3 is a display unit, 4 is a printer, 5 is a keyboard, and 6 is a floppy disk. The hardware configuration of the invention is the same as in the conventional example: the floppy disk 6 storing the program routine is inserted into the computer (with FDD) 2, and the operating software is run to start the information processing apparatus 1. The program routine is loaded, and the apparatus waits for input. When data are keyed in from the keyboard 5, the program routine runs, the processing result is displayed on the display 3, and the result is printed on the printer 4.

[0052] In this embodiment, Equation 2 is used to calculate the detection statistic.

[0053]

[Equation 2]

[0054] It suffices to find the combination of samples that minimizes this statistic. FIG. 1 is a flowchart illustrating the present invention. Step 20 is the input step, in which data are entered successively from the keyboard 5. For example, five data values x₁, x₂, x₃, x₄, x₅ are entered. In this case, with N denoting the number of data values entered, N = 5. Step 21 compares the magnitudes of the entered data and arranges them, for example in ascending order so that x₁ < x₂ < x₃ < x₄ < x₅. This is the magnitude determination step.

[0055] Sorting the data in ascending order in this way makes it easy to find outlier candidates. If the number of outlier candidates is s (s ≥ 1), the candidates are, by their nature, the s largest values, the s smallest values, or s values taken from the large and small ends together.

[0056] Step 22 is the calculation step, in which the detection statistic Ut is calculated from these data sets by the formula of this embodiment. When the number of outlier candidates is s = 1, the detection statistic is first calculated with x₁ removed from (x₁, x₂, x₃, x₄, x₅); this is denoted Ut(-1). Likewise, the statistic with x₅ removed is denoted Ut(-5). When s = 2, the statistic with x₁ and x₂ removed is denoted Ut(-1,-2), that with x₄ and x₅ removed Ut(-4,-5), and that with x₁ and x₅ removed Ut(-1,-5); samples are removed in order from the smallest or largest values and the detection statistic is calculated. The detection statistic is calculated in this way for each possible combination. Here, the number s of outlier candidates is assumed to be set in advance by the system. Alternatively, s may be specified by the operator or by a program, or may be set freely for each calculation.

[0057] The number of outlier candidates given here is not a number of outliers that must be found; it is the maximum number of values that may be detected as outliers. For example, s = 2 means that at most two outliers are sought, not that exactly two must be found. With s = 2, the result may therefore be zero, one, or two outliers. Hereinafter, the number s of outlier candidates likewise denotes the maximum number of values that can be detected as outliers. Thus, in this embodiment and those described later, the number of outliers need not be fixed at a specific value; it suffices to specify its maximum.

[0058] Step 23 finds the minimum value (Utmin) of the detection statistic Ut. Step 24 compares Utmin with the detection statistic Ut(0) obtained when no outlier candidate is removed. In step 25, if Utmin is smaller, the values not included in the data combination that gave Utmin are taken as outliers. Step 26 handles the case where Ut(0) is the minimum; in this case there is "no" outlier. Steps 23 to 26 constitute the outlier detection step. Step 27 displays or otherwise outputs the result.

【００５９】次に、データを使ってこのフローチャート
の流れを説明する。ステップ２０で次の５つのデータを
入力する。５．７１、６．５７、７．２９、８．０６、１３．３２ステップ２１で入力されたデータを昇順に並べかえる。
ステップ２２で数２を使い、Ｕｔの値を計算する。例え
ば、一番小さな値５．７１を除いた時は、サンプル数ｎ
＝４、外れ値の候補の個数ｓ＝１であるので、検出統計
量をＵｔ_(-1)と表すと、Ｕｔ_(-1)＝５．９０８となる。
一番大きな値１３．３２を除いた時の検出統計量は、サ
ンプル数ｎ＝４、外れ値の候補の個数ｓ＝１であるの
で、Ｕｔ_(-5)と表すと、Ｕｔ_(-5)＝１．４４０となる。
５．７１と１３．３２をともに除いた時は、サンプル数
ｎ＝３、外れ値の候補の個数ｓ＝２であるので、Ｕｔ
_(-1,-5)と表すと、Ｕｔ_(-1,-5)＝２．５０９となる。
また、Ｕｔ_(-1,-2)、Ｕｔ_(-4,-5)、Ｕｔ_(-1,-2,-5)、
Ｕｔ_(-1,-4,-5)を計算すると（即ち、外れ値の候補の個
数ｓ＝３の検出統計量Ｕｔ₍₀₎を計算すると）表２のよ
うになる。Next, the flow of this flowchart is explained using data. In step 20, the following five values are input: 5.71, 6.57, 7.29, 8.06, 13.32. In step 21, the input data are sorted in ascending order. In step 22, Ut is calculated using Equation 2. For example, when the smallest value 5.71 is excluded, the number of samples is n = 4 and the number of outlier candidates is s = 1; writing this detection statistic as Ut_(-1), we get Ut_(-1) = 5.908. When the largest value 13.32 is excluded, n = 4 and s = 1; writing this as Ut_(-5), we get Ut_(-5) = 1.440. When both 5.71 and 13.32 are excluded, n = 3 and s = 2; writing this as Ut_(-1,-5), we get Ut_(-1,-5) = 2.509. Calculating Ut_(-1,-2), Ut_(-4,-5), Ut_(-1,-2,-5), and Ut_(-1,-4,-5) as well (that is, calculating the detection statistics for a number of outlier candidates s = 3) gives the results in Table 2.

【００６０】[0060]

【表２】 [Table 2]

【００６１】ステップ２３で以上で求めたＵｔの最小値
Ｕｔｍｉｎを求めると、Ｕｔｍｉｎ＝Ｕｔ_(-5)＝１．４
４０である。ステップ２４で外れ値の候補を除かない時
の検出統計量Ｕｔ₍₀₎＝４．９３０を計算し、Ｕｔｍｉ
ｎとＵｔ₍₀₎を比較する。すると、Ｕｔ_(-5)＞Ｕｔ₍₀₎
は成立しないので、ステップ２５へ行き、Ｕｔ_(-5)の時
のデータの組み合せを外れ値とする。即ち、１３．３２
が外れ値とわかる。ステップ２７で、外れ値１３．３２
の表示等出力を行う。In step 23, the minimum Utmin of the Ut values obtained above is found: Utmin = Ut_(-5) = 1.440. In step 24, the detection statistic without removing any outlier candidate, Ut₍₀₎ = 4.930, is calculated and compared with Utmin. Since Ut_(-5) > Ut₍₀₎ does not hold, the process proceeds to step 25, and the value excluded in the Ut_(-5) combination is taken as the outlier; that is, 13.32 is identified as the outlier. In step 27, the outlier 13.32 is displayed or otherwise output.
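
The flow of steps 20 to 27 can be sketched in Python as follows. This is a minimal sketch, not the patented implementation: the function names `ut` and `detect_outliers` are hypothetical, and it assumes that "log" is the natural logarithm and that σ is the population standard deviation (divide by n), because those choices reproduce the values 4.930, 5.908, and 1.440 quoted above. It also assumes that candidates are removed only from the two tails of the sorted data, as in the listed combinations, and it skips combinations whose remaining values are all identical, since log σ is then undefined.

```python
import math
from statistics import pstdev


def ut(kept, s):
    """Detection statistic Ut = n*ln(sigma) + 2*s for the values that are kept."""
    return len(kept) * math.log(pstdev(kept)) + 2 * s


def detect_outliers(data, s_max):
    """Return (outliers, Ut) for the tail combination with the smallest Ut."""
    x = sorted(data)                       # step 21: ascending order
    best_removed, best_ut = [], ut(x, 0)   # Ut(0): nothing removed
    for s in range(1, s_max + 1):          # step 22: every candidate count up to s_max
        for low in range(s + 1):           # remove `low` smallest and `high` largest
            high = s - low
            kept = x[low:len(x) - high]
            if len(kept) < 2 or pstdev(kept) == 0:
                continue                   # ln(sigma) undefined: skipped (assumption)
            u = ut(kept, s)
            if u < best_ut:                # steps 23 to 25: keep the minimum
                best_removed, best_ut = x[:low] + x[len(x) - high:], u
    return best_removed, best_ut           # empty list: no outlier (step 26)


outliers, u_min = detect_outliers([5.71, 6.57, 7.29, 8.06, 13.32], s_max=3)
print(outliers, round(u_min, 3))
```

An empty list is returned when Ut₍₀₎ is already the smallest value, which corresponds to step 26.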

【００６２】このように外れ値の候補の個数が３の場合
であっても、検出された外れ値の個数は１つであり、外
れ値の候補の個数以内の範囲で外れ値を検出することが
できる。Thus, even when the number of outlier candidates is 3, only one outlier is detected; outliers are detected within the limit set by the number of outlier candidates.

【００６３】この例で示した５．７１、６．５７、７．
２９、８．０６、１３．３２のデータの場合、１３．３
２を外れ値とするのは、竹内啓（１９８０）「現象と
行動の中の統計数理」新曜社でも同様の結果となってい
る。For the data 5.71, 6.57, 7.29, 8.06, 13.32 used in this example, Kei Takeuchi (1980), "Statistical Mathematics in Phenomena and Behavior" (Shin-yo-sha), likewise identifies 13.32 as the outlier.

【００６４】この実施例の数２に示した検出統計量の式Ｕｔ＝ｎｌｏｇσ＋２ｓは、ＡＩＣ（ＡＫＡＩＫＥ’Ｓｉｎｆｏｒｍａｔｉｏ
ｎｃｒｉｔｅｒｉｏｎ）のアナロジーから考えられ
た。ｎはサンプル数、ｓは外れ値の候補の個数、σ²は
分散、σは標準偏差である。この式の第１項は、外れ値
の候補であるサンプルが除かれると小さくなる傾向があ
る。というのは、分散σ²は外れ値を除くと小さくなる
からである。また、サンプル数ｎも外れ値の候補として
除く数が増えると、ｎ＝（データ数Ｎ）−（外れ値の候
補の個数ｓ）であるから、例えば、ｎは５から４、５か
ら３というように小さな値になるからである。第２項
は、サンプル数が多くなると増加する。従って、外れ値
を除いた時、第１項と第２項の和Ｕｔは最小になると考
えることができる。The detection statistic of Equation 2 in this embodiment, Ut = n log σ + 2s, was conceived by analogy with AIC (Akaike's information criterion). Here n is the number of samples, s is the number of outlier candidates, σ² is the variance, and σ is the standard deviation. The first term tends to decrease when samples that are outlier candidates are removed, because the variance σ² becomes smaller once outliers are excluded. In addition, since n = (number of data N) - (number of outlier candidates s), n itself shrinks as more candidates are removed, for example from 5 to 4 or from 5 to 3. The second term increases as the number of samples removed as outlier candidates increases. The sum Ut of the two terms can therefore be expected to reach its minimum when the outliers have been removed.
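
The analogy with AIC can be made explicit for a normal model. The following is a standard textbook derivation added as background only; the text does not state how Equation 2 was derived, and reading σ as the maximum-likelihood (divide-by-n) standard deviation with a natural logarithm is an assumption, although it is the reading that reproduces Ut₍₀₎ = 4.930 for the five-value example.

```latex
\begin{align}
\mathrm{AIC} &= -2\ln\hat{L} + 2k, \\
-2\ln\hat{L} &= n\ln\!\left(2\pi\hat{\sigma}^{2}\right) + n
              = 2n\ln\hat{\sigma} + n\left(\ln 2\pi + 1\right), \\
U_t &= n\ln\sigma + 2s .
\end{align}
```

Here k is the number of free parameters of the model and the second line is the maximized log-likelihood of n independent normal observations. Both criteria combine a goodness-of-fit term that grows with n ln σ and a penalty that is linear in a count, k for AIC and s for Ut.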

【００６５】次に、この関係をＧｒｕｂｂｓのデータ１
を使って述べる。Ｇｒｕｂｂｓのデータ１を次に示す。２．０２、２．２２、３．０４、３．２３、３．５９、
３．７３、３．９４、４．０５、４．１１、４．１３総データ数は１０個である。このデータからｌｏｇσを
計算した値を表３に、ｎｌｏｇσを計算した値を表４に
示し、Ｕｔ＝ｎｌｏｇσ＋２ｓを計算した値を表５に示
す。Next, this relationship is illustrated using Grubbs' data set 1, shown below: 2.02, 2.22, 3.04, 3.23, 3.59, 3.73, 3.94, 4.05, 4.11, 4.13. The total number of data is 10. Table 3 shows the values of log σ calculated from these data, Table 4 the values of n log σ, and Table 5 the values of Ut = n log σ + 2s.

【００６６】[0066]

【表３】 [Table 3]

【００６７】[0067]

【表４】 [Table 4]

【００６８】[0068]

【表５】 [Table 5]

【００６９】これらをグラフにしたものが図２である。
図２のｘ軸は、外れ値の候補の個数ｓであり、ｙ軸はＵ
ｔの値である。ｓに対応するＵｔの値が複数ある場合
は、その中の最小のものをプロットした。例えば、表３
において外れ値の候補の個数ｓ＝１の場合、Ｕｔは−
０．３１８と−０．５１４となるが、−０．５１４を用
いてプロットした。図２の一点鎖線で示したグラフ
（１）は、Ｕｔ＝ｌｏｇσとした場合を示している。点
線で示したグラフ（２）は、Ｕｔ＝ｎｌｏｇσとした場
合を示している。実線で示したグラフ（３）は、Ｕｔ＝
２ｓとした場合である。太線で表したグラフ（４）は、
ｎｌｏｇσと２ｓを加算したＵｔ＝ｎｌｏｇσ＋２ｓの
値である。These values are plotted in FIG. 2. The x-axis of FIG. 2 is the number of outlier candidates s, and the y-axis is the value of Ut. When several Ut values correspond to the same s, the smallest is plotted. For example, in Table 3, for s = 1, Ut takes the values -0.318 and -0.514, and -0.514 is plotted. Graph (1), drawn with a dash-dot line, shows the case Ut = log σ. Graph (2), drawn with a dotted line, shows the case Ut = n log σ. Graph (3), drawn with a solid line, shows the case Ut = 2s. Graph (4), drawn with a thick line, shows Ut = n log σ + 2s, the sum of n log σ and 2s.

【００７０】前述したように外れ値の候補の個数ｓが増
加するに従って、第１項のｎｌｏｇσは減少することが
グラフ（２）よりわかる。そして、第２項の２ｓは増加
することが、グラフ（３）よりわかる。グラフ（４）に
示すｎｌｏｇσ＋２ｓの値は、ｓ＝２で最小値を取った
後増加している。Ｕｔ＝ｎｌｏｇσ＋２ｓの場合、最小
値は１ヶである。表５からわかるように、グラフ（４）
が最小となるのは、外れ値の候補の個数ｓ＝２であっ
て、その外れ値として２．０２と２．２２を仮定した場
合である。これは、Ｋｉｔａｇａｗａの方法でも同じ結
果を得ている（ＧｅｎｓｈｉｒｏＫｉｔａｇａｗａ（１
９７９）：“ＯｎｔｈｅＵｓｅｏｆＡｉｃｆ
ｏｒｔｈｅＤｅｔｅｃｔｉｏｎｏｆＯｕｔｌｉ
ｅｒｓ”，Ｔｅｃｈｎｏｍｅｔｒｉｃｓ，Ｖｏｌ．２
１，Ｎｏ．２）。As described above, graph (2) shows that the first term n log σ decreases as the number of outlier candidates s increases, and graph (3) shows that the second term 2s increases. The value of n log σ + 2s shown in graph (4) takes its minimum at s = 2 and increases thereafter. For Ut = n log σ + 2s there is a single minimum. As Table 5 shows, graph (4) is minimized when s = 2 with 2.02 and 2.22 assumed to be the outliers. Kitagawa's method gives the same result (Genshiro Kitagawa (1979): "On the Use of AIC for the Detection of Outliers", Technometrics, Vol. 21, No. 2).

【００７１】次に、Ｕｔ＝ｎｌｏｇσ＋２ｓとｓの関係
をもう１つ別の例で述べる。データとして、 −１．４０、−０．４４、−０．３０、−０．２４、−
０．２２、−０．１５、−０．１３、０．０６、０．１
０、０．１８、０．２０、０．４８、０．６３、１．０
１とする。これにより得られたＵｔ＝ｎｌｏｇσ＋２ｓの
値を表６に示す。Next, the relationship between Ut = n log σ + 2s and s is illustrated with another example. The data are -1.40, -0.44, -0.30, -0.24, -0.22, -0.15, -0.13, 0.06, 0.10, 0.18, 0.20, 0.48, 0.63, 1.01. Table 6 shows the resulting values of Ut = n log σ + 2s.

【００７２】[0072]

【表６】 [Table 6]

【００７３】表６を基に、グラフを書くと図３のように
なる。図３のｘ軸は外れ値の候補の個数ｓ、ｙ軸はＵｔ
＝ｎｌｏｇσ＋２ｓの値である。図３からわかるよう
に、この場合もＵｔの値が最小になった後、ｓが大きく
なるにつれ、Ｕｔの値も大きくなっている。また、外れ
値の性格より、外れ値の数は総データ数に比して小さな
数であると考えられる。よって、以後検出統計量の計算
結果を表に示す場合、最小の前後のデータのみを示すこ
とにする。この外れ値の性格を用いることにより、外れ
値の候補の数を予め指定することなく、外れ値を検出す
ることも可能である。前述したように、外れ値の数が大
きくなるにつれて検出統計量の値も大きくなる。従っ
て、外れ値の候補の数を指定しない場合には、外れ値の
候補の数が少ない順に検出統計量を算出し、順に外れ値
の候補の数を増やして検出統計量を算出し、その計算し
た検出統計量が次第に大きくなる場合には、その計算を
終了させる。このようにして、外れ値の候補の数が予め
指定されていない場合であっても、外れ値を検出するこ
とが可能になる。従って、前述したように外れ値の候補
の数を予め指定する場合以外に、外れ値の候補の数をシ
ステムやプログラムにより指定せずに、検出統計量の計
算結果を比較していくことにより、その計算結果が次第
に大きくなることが判明した時点で検出統計量の計算を
停止させることにより、外れ値を検出することが可能に
なる。次に、検出統計量Ｕｔが有効であるかどうか検証
するために、従来の計算方法による結果と比較したもの
を実施例２から実施例６で述べる。Plotting Table 6 gives FIG. 3, in which the x-axis is the number of outlier candidates s and the y-axis is the value of Ut = n log σ + 2s. As FIG. 3 shows, in this case too, once Ut has reached its minimum it grows as s grows. Moreover, by the nature of outliers, their number can be assumed to be small compared with the total number of data. Accordingly, when calculated detection statistics are tabulated hereafter, only the entries around the minimum are shown. This property also makes it possible to detect outliers without specifying the number of outlier candidates in advance. As noted above, the detection statistic grows as the assumed number of outliers grows. When the number of outlier candidates is not specified, the detection statistic is therefore calculated starting from the smallest number of candidates, the number of candidates is increased step by step, and the calculation is terminated once the calculated statistic is found to keep increasing. In this way, outliers can be detected whether the number of outlier candidates is specified in advance, as described earlier, or left unspecified by the system or program. Next, to verify that the detection statistic Ut is effective, Examples 2 to 6 compare it with results obtained by conventional calculation methods.
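
The variant without a preset number of candidates can be sketched as below. The stopping rule is an assumption: the text only says that the calculation ends once the statistic is found to keep increasing, so this sketch stops after the per-s minimum has failed to improve for `patience` consecutive values of s. The names `min_ut_for`, `detect_without_s_max`, and `patience` are hypothetical, and the statistic again assumes the natural logarithm and the population standard deviation.

```python
import math
from statistics import pstdev


def min_ut_for(x, s):
    """Smallest Ut = n*ln(sigma) + 2*s over all ways of cutting s tail values."""
    best = (math.inf, [])
    for low in range(s + 1):
        high = s - low
        kept = x[low:len(x) - high]
        if len(kept) < 2 or pstdev(kept) == 0:
            continue                      # ln(sigma) undefined: skipped (assumption)
        u = len(kept) * math.log(pstdev(kept)) + 2 * s
        if u < best[0]:
            best = (u, x[:low] + x[len(x) - high:])
    return best


def detect_without_s_max(data, patience=2):
    """Increase s from 0 and stop after `patience` non-improving steps."""
    x = sorted(data)
    best_u, best_removed = min_ut_for(x, 0)
    rises, s = 0, 0
    while rises < patience and s + 1 <= len(x) - 2:
        s += 1
        u, removed = min_ut_for(x, s)
        if u < best_u:
            best_u, best_removed, rises = u, removed, 0
        else:
            rises += 1
    return best_removed, best_u


grubbs_1 = [2.02, 2.22, 3.04, 3.23, 3.59, 3.73, 3.94, 4.05, 4.11, 4.13]
removed, u_min = detect_without_s_max(grubbs_1)
print(removed, round(u_min, 2))
```

A tolerance of more than one rise is used because, under these assumptions, Grubbs' data set 1 gives a per-s minimum at s = 1 (about -2.63) that is larger than Ut₍₀₎ (about -3.13), while the overall minimum is reached at s = 2; stopping at the very first rise would miss it.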

【００７４】実施例２．この実施例では、Ｇｒｕｂｂｓ
のデータ２を用い、検出統計量の式としてＵｔを用いた
場合の外れ値について述べる。Ｇｒｕｂｂｓのデータ
は、全て次の文献より引用している。 “Ｐｒｏｃｅｄｕｒｅｓｆｏｒｄｅｔｅｄｔｉｎｇ
ｏｕｔｙｉｎｇＯｂｓｅｒｖａｔｉｏｎｓｉｎ
ｓａｍｐｌｅｓ”，Ｔｅｃｈｎｏｍｅｔｒｉｃｓ，
Ｖｏｌ．１１，１−２１Ｇｒｕｂｂｓのデータ２は次の値である（データ数は１
２）。０．７４５、１．８３２、１．８５６、１．８８４、
１．９１４、１．９１６、１．９４７、１．９４９、
２．０１３、２．０２３、２．０４５、２．３２７原典では、３回の観測値とその平均値が載っているが、
ここでは平均値のみを昇順に載せる。検出統計量の計算
結果を表７に示す。Example 2. This example describes the outliers found when Ut is used as the detection statistic on Grubbs' data set 2. All of Grubbs' data are quoted from the following reference: "Procedures for Detecting Outlying Observations in Samples", Technometrics, Vol. 11, 1-21. Grubbs' data set 2 consists of the following 12 values: 0.745, 1.832, 1.856, 1.884, 1.914, 1.916, 1.947, 1.949, 2.013, 2.023, 2.045, 2.327. The original reference gives three observations and their mean for each item; only the means are listed here, in ascending order. Table 7 shows the calculated detection statistics.

【００７５】[0075]

【表７】 [Table 7]

【００７６】表７よりＵｔが最小値をとるのは、０．７
４５と２．３２７を外れ値とした場合である。前述した
Ｋｉｔａｇａｗａの方法も同じ結果となっている。From Table 7, Ut is minimized when 0.745 and 2.327 are taken as outliers. The Kitagawa method mentioned above gives the same result.

【００７７】実施例３．この実施例は、Ｇｒｕｂｂｓの
データ３を用いた場合について述べる。Ｇｒｕｂｂｓの
データ３は次の値である（データ数は１０）。５６８、５７０、５７０、５７０、５７２、５７２、５
７２、５７８、５８４、５９６検出統計量の計算結果を表８に示す。Example 3. This example uses Grubbs' data set 3, which consists of the following 10 values: 568, 570, 570, 570, 572, 572, 572, 578, 584, 596. Table 8 shows the calculated detection statistics.

【００７８】[0078]

【表８】 [Table 8]

【００７９】表８よりＵｔが最小値をとるのは、５８
４、５９６を外れ値とした場合である。Ｇｒｕｂｂｓに
よると５９６を外れ値としている。ＤａｌｌａｓＥ．
Ｊｏｈｎｓｏｎ他は、５８４、５９６を外れ値とし
た。これは、数２で求めた場合と同じである。なお、以
後ＤａｌｌａｓＥ．Ｊｏｈｎｓｏｎ他という場合は、
次の資料に基づくものとする。ＤａｌｌａｓＥ．Ｊｏｈｎｓｏｎ，Ｓｔｅｐｈｅ
ｎＡ．ＭｃＧｕｉｒｅ，ａｎｄＧｅｒｏｇｅ
Ａ．Ｍｉｌｌｉｋｅｎ（１９７８）：“Ｅｓｔｉｍａ
ｔｉｎｇ σ² ｉｎｔｈｅＰｒｅｓｅｎｃｅｏ
ｆＯｕｔｏｌｉｅｒｓ”，Ｔｅｃｈｎｏｍｅｔｒｉ
ｃｓ，Ｖｏｌ．２０，Ｎｏ．４ From Table 8, Ut is minimized when 584 and 596 are taken as outliers. Grubbs treats only 596 as an outlier. Dallas E. Johnson et al. treat 584 and 596 as outliers, which agrees with the result obtained from Equation 2. Hereinafter, "Dallas E. Johnson et al." refers to the following paper: Dallas E. Johnson, Stephen A. McGuire, and George A. Milliken (1978): "Estimating σ² in the Presence of Outliers", Technometrics, Vol. 20, No. 4.

【００８０】実施例４．更に、実施例３のデータで、同
じデータを重複させてサイズを２倍にしたものを用いた
場合を次に示す。データ４は次の値である（データ数は
２０）。５６８、５６８、５７０、５７０、５７０、５７０、５
７０、５７０、５７２、５７２、５７２、５７２、５７
２、５７２、５７８、５７８、５８４、５８４、５９
６、５９６ Example 4. Next, the data of Example 3 are duplicated to double the sample size. Data set 4 consists of the following 20 values: 568, 568, 570, 570, 570, 570, 570, 570, 572, 572, 572, 572, 572, 572, 578, 578, 584, 584, 596, 596.

【００８１】この実施例では、データのサイズを２倍に
したので、外れ値の候補の個数を４（ｓ＝４）とする場
合について説明する。外れ値の候補の個数が４の場合
は、以下のような組み合せに対して検出統計量を算出す
ることになる。即ち、外れ値の候補の個数が１（ｓ＝
１）の場合の統計検出量と、外れ値の候補の個数が２
（ｓ＝２）の場合の検出統計量と、外れ値の候補の個数
３（ｓ＝３）の場合の検出統計量と、外れ値の候補の個
数が４（ｓ＝４）の場合の検出統計量を求める必要があ
る。外れ値の候補の個数ｓに対応する検出統計量は、以
下に示すとおりである。ｓ＝１Ｕｔ_(-1) Ｕｔ_(-20) ｓ＝２Ｕｔ_(-1,-2) Ｕｔ_(-1,-20) Ｕｔ_(-19,-20) ｓ＝３Ｕｔ_(-1,-2,-3) Ｕｔ_(-1,-2,-20) Ｕｔ_(-1,-19,-20) Ｕｔ_{(-18,-19,-20)} ｓ＝４Ｕｔ_{(-1,-2,-3,-4)} Ｕｔ_{(-1,-2,-3,-20)} Ｕｔ_{(-1,-2,-19,-20)} Ｕｔ_{(-1,-18,-19,-20)} Ｕｔ_{(-17,-18,-19,-20)} In this example, since the data size has been doubled, the number of outlier candidates is set to 4 (s = 4). With s = 4, detection statistics must be calculated for each of the candidate counts s = 1, s = 2, s = 3, and s = 4. The detection statistics corresponding to each s are as follows.
s = 1: Ut_(-1), Ut_(-20)
s = 2: Ut_(-1,-2), Ut_(-1,-20), Ut_(-19,-20)
s = 3: Ut_(-1,-2,-3), Ut_(-1,-2,-20), Ut_(-1,-19,-20), Ut_(-18,-19,-20)
s = 4: Ut_(-1,-2,-3,-4), Ut_(-1,-2,-3,-20), Ut_(-1,-2,-19,-20), Ut_(-1,-18,-19,-20), Ut_(-17,-18,-19,-20)
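
The list of combinations above follows a simple pattern: for each s, some of the removed values come from the low end of the sorted data and the rest from the high end. A minimal sketch of that enumeration is shown below; the function name `tail_combinations` and the representation of each combination as a tuple of negated 1-based positions are choices made here for illustration only.

```python
def tail_combinations(n_data, s_max):
    """Map each s to the removed positions (1-based, negated) of every tail combination."""
    table = {}
    for s in range(1, s_max + 1):
        combos = []
        for high in range(s + 1):              # `high` largest, `low` smallest removed
            low = s - high
            smallest = [-(i + 1) for i in range(low)]
            largest = [-(n_data - high + 1 + j) for j in range(high)]
            combos.append(tuple(smallest + largest))
        table[s] = combos
    return table


table = tail_combinations(n_data=20, s_max=4)
for s, combos in table.items():
    print(s, combos)
```

With this scheme each s contributes s + 1 combinations, so N = 20 and s = 4 requires 2 + 3 + 4 + 5 = 14 detection statistics in addition to Ut₍₀₎.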

【００８２】外れ値の候補の個数が４の場合においても
図１に示したフローチャート同様の順に外れ値を検出す
ることが可能である。異なる点は、図１におけるステッ
プ２２において前述したようなｓ＝１からｓ＝４までの
それぞれの検出統計量を算出する点である。このように
して、算出された検出統計量Ｕｔの計算結果を表９に示
す。Even when the number of outlier candidates is 4, it is possible to detect outliers in the same order as in the flowchart shown in FIG. The difference is that the detection statistics for each of s = 1 to s = 4 as described above are calculated in step 22 in FIG. Table 9 shows the calculation result of the detection statistic Ut thus calculated.

【００８３】[0083]

【表９】 [Table 9]

【００８４】表９より外れ値は、５８４、５８４、５９
６、５９６である。ＤａｌｌａｓＥ．Ｊｏｈｎｓｏｎ
他も同様の結論となっている。From Table 9, the outliers are 584, 584, 596, and 596. Dallas E. Johnson et al. reach the same conclusion.

【００８５】尚、外れ値の候補の個数は、入力されたデ
ータの数に基づいて常識的な範囲で任意に設定できるも
のである。例えば、入力されたデータの数が５（Ｎ＝
５）である場合に、外れ値の候補の個数は１又は２（ｓ
＝１又は２）とするのが常識的な範囲である。また、入
力されたデータの数が多くなれば外れ値の候補の個数も
多くする分には差し支えない。このように外れ値の候補
の個数は、入力されたデータの数、あるいは、そのシス
テムにおいて、どの位の精度を要求しているかというシ
ステムの要求に応じて判断されるべきものである。前述
した実施例、あるいは、後述する実施例においては、外
れ値の数を何個と推定するかという判断は、予めシステ
ムにより定められているか、あるいは、オペレータやプ
ログラムにより任意に指定できるものとする。The number of outlier candidates can be set freely within a reasonable range based on the number of input data. For example, when five values are input (N = 5), one or two candidates (s = 1 or 2) is a reasonable choice. As the number of input data grows, the number of outlier candidates may be increased accordingly. The number of outlier candidates should thus be decided from the number of input data or from system requirements such as the accuracy demanded. In the examples described above and below, the number of outliers to assume is either predetermined by the system or freely specified by an operator or a program.

【００８６】実施例５．次に、Ｒｏｓｎｅｒのデータを
用いた例について述べる。この例は、サイズが５４と比
較的大きく、外れ値も多く存在すると考えられるケース
である。次に、データを示す。 −０．２５、０．６８、０．９４、１．１５、１．２
０、１．２６、１．２６、１．３４、１．３８、１．４
３、１．４９、１．４９、１．５５、１．５６、１．５
８、１．６５、１．６９、１．７０、１．７６、１．７
７、１．８１、１．９１、１．９４、１．９６、１．９
９、２．０６、２．０９、２．１０、２．１４、２．１
５、２．２３、２．２４、２．２６、２．３５、２．３
７、２．４０、２．４７、２．５４、２．６２、２．６
４、２．９０、２．９２、２．９２、２．９３、３．２
１、３．２６、３．３０、３．５９、３．６８、４．３
０、４．６４、５．３４、５．４２、６．０１この、Ｒｏｓｎｅｒのデータは次の文献からとった。ＢｅｒｎａｒｄＲｏｓｎｅｒ（１９７７）：“Ｐｅｒ
ｃｅｎｔａｇｅＰｏｉｎｔｆｏｒａＧｅｎｅｒ
ａｌｉｚｅｄＥＳＤＭａｎｙ−ＯｕｔｌｉｅｒＰ
ｒｏｃｅｄｕｒｅ”，Ｔｅｃｈｎｏｍｅｔｒｉｃｓ，
Ｖｏｌ．２５，Ｎｏ．２次に、Ｕｔの計算結果を表１０に示す。Example 5. Next, an example using Rosner's data is described. In this case the sample size is relatively large, at 54, and many outliers are thought to be present. The data are as follows: -0.25, 0.68, 0.94, 1.15, 1.20, 1.26, 1.26, 1.34, 1.38, 1.43, 1.49, 1.49, 1.55, 1.56, 1.58, 1.65, 1.69, 1.70, 1.76, 1.77, 1.81, 1.91, 1.94, 1.96, 1.99, 2.06, 2.09, 2.10, 2.14, 2.15, 2.23, 2.24, 2.26, 2.35, 2.37, 2.40, 2.47, 2.54, 2.62, 2.64, 2.90, 2.92, 2.92, 2.93, 3.21, 3.26, 3.30, 3.59, 3.68, 4.30, 4.64, 5.34, 5.42, 6.01. Rosner's data are taken from the following reference: Bernard Rosner (1977): "Percentage Point for a Generalized ESD Many-Outlier Procedure", Technometrics, Vol. 25, No. 2. Table 10 shows the calculated values of Ut.

【００８７】[0087]

【表１０】 [Table 10]

【００８８】Ｒｏｓｎｅｒは、外れ値が最大１０あると
仮定して検定した。危険率５％で、５．３４、５．４
２、６．０１を外れ値とした。表１０は、この実施例に
よる検出統計量の計算結果を示す表である。前述したよ
うに検出統計量の計算結果を表に示す場合には、最小の
値の前後のデータのみを示してある。この表１０からわ
かるように、計算式Ｕｔを用いた場合、外れ値は４．３
０、４．６５、５．３４、５．４２、６．０１である。
この場合には、外れ値として５つの外れ値が検出されて
いるが、Ｒｏｓｎｅｒが仮定したように外れ値が最大１
０個あると仮定した場合であっても、あるいは、外れ値
の候補の数を指定せずに外れ値の候補の数を増やす毎に
計算された検出統計量を比較することにより自動的に外
れ値を検出した場合のいずれの場合においても、結果は
この５つの外れ値を検出する。この実施例においては、
５つの外れ値を検出したが、もしこの方法で外れ値が３
個までとすると、Ｒｏｓｎｅｒと一致している。Rosner tested under the assumption of at most 10 outliers and, at a 5% significance level, identified 5.34, 5.42, and 6.01 as outliers. Table 10 shows the detection statistics calculated in this example; as noted above, only the entries around the minimum are shown. As Table 10 shows, with the formula Ut the outliers are 4.30, 4.64, 5.34, 5.42, and 6.01. Five outliers are thus detected, and the result is the same five values whether at most 10 outliers are assumed, as Rosner did, or the number of candidates is left unspecified and outliers are detected automatically by comparing the detection statistic each time the number of candidates is increased. Although five outliers were detected in this example, if the method is limited to at most three outliers, the result agrees with Rosner's.

【００８９】実施例６．正規乱数、指数乱数、一様乱数
をサンプルデータとした場合について述べる。正規乱
数、指数乱数は、外れ値が現れる可能性があるが、一様
乱数からは外れ値は出て欲しくない。正規乱数データは
（ｎ＝１０、〜Ｎ（０、１））より、 −２．６６６、−１．２７２、−０．０４２、０．１４
０、０．２７３、０．４１５、０．４６７、１．１６
０、１．６７２、１．６７３である。Ｕｔの計算結果を表１１に示す。Example 6. This example uses normal, exponential, and uniform random numbers as sample data. Outliers may appear among normal and exponential random numbers, but none should be detected among uniform random numbers. The normal random data (n = 10, drawn from N(0, 1)) are -2.666, -1.272, -0.042, 0.140, 0.273, 0.415, 0.467, 1.160, 1.672, 1.673. Table 11 shows the calculated values of Ut.

【００９０】[0090]

【表１１】 [Table 11]

【００９１】表１１より外れ値は−２．６６６、−１．
２７２である。From Table 11, the outliers are -2.666 and -1.272.

【００９２】次に、指数乱数を用いた場合について示
す。データは、竹内「現象と行動の中の統計数理」（新
曜社）からとった。０．００３、０．０２１、０．１６１、０．１７８、
０．１８０、０．２１０、０．２４９、０．４１３、
０．４９４、０．５６２、０．６１３、０．８７９、
０．９８１、１．０５９、１．１３１、１．２６４、
２．３６７、３．６６９、３．８２６、４．１９３総データ数は２０である。Ｕｔの計算結果を表１２に示
す。Next, exponential random numbers are used. The data are taken from Takeuchi, "Statistical Mathematics in Phenomena and Behavior" (Shin-yo-sha): 0.003, 0.021, 0.161, 0.178, 0.180, 0.210, 0.249, 0.413, 0.494, 0.562, 0.613, 0.879, 0.981, 1.059, 1.131, 1.264, 2.367, 3.669, 3.826, 4.193. The total number of data is 20. Table 12 shows the calculated values of Ut.

【００９３】[0093]

【表１２】 [Table 12]

【００９４】表１２より外れ値は、４．１９３、３．８
２６、３．６６９、２．３６７である。From Table 12, the outliers are 4.193, 3.826, 3.669, and 2.367.

【００９５】次に、［０、１］の一様乱数を用いた場合
について述べる。データ数１０でデータは次の通りであ
る。０．２８３、０．４７０、０．６４３、０．６８８、
０．９１６、０．９３０、０．９４５、０．９５３、
０．９７３、０．９９５Ｕｔの計算結果を表１３に示す。Next, uniform random numbers on [0, 1] are used. The 10 data values are 0.283, 0.470, 0.643, 0.688, 0.916, 0.930, 0.945, 0.953, 0.973, 0.995. Table 13 shows the calculated values of Ut.

【００９６】[0096]

【表１３】 [Table 13]

【００９７】表１３より外れ値の候補がない場合が最小
となっているので、一様乱数の場合、外れ値の候補はな
い。一様乱数については、更に１ケース試みたが同様に
外れ値の候補はなかった。一様乱数という性格上、外れ
値の候補なしということは望ましい結果である。Table 13 shows that Ut is smallest when no outlier candidate is removed, so no outliers are detected for the uniform random numbers. One further set of uniform random numbers was tried and likewise yielded no outliers. Given the nature of uniform random numbers, finding no outliers is the desired result.

【００９８】実施例７．従来の技術に出ているデータに
ついて、Ｕｔを用いて検出統計量を求める。データは、５．７１、６．５７、７．２９、８．０６、１０．０
０、１５．００である。結果は表１４のようになる。Example 7. The detection statistic Ut is calculated for the data given in the description of the prior art: 5.71, 6.57, 7.29, 8.06, 10.00, 15.00. The results are shown in Table 14.

【００９９】[0099]

【表１４】 [Table 14]

【０１００】表１４よりＵｔは１５．００を除いた時、
最小値となることがわかり、１５．００を外れ値とす
る。Table 14 shows that Ut is minimized when 15.00 is excluded, so 15.00 is taken as the outlier.

【０１０１】実施例８．従来技術で述べたマスク効果の
データについて、数２に示した計算式を用いて検出統計
量を求める。データは、５．７１、６．５７、７．２９、８．０６、１４．８
０、１５．００である。結果は表１５のようになる。Example 8. The detection statistic is calculated with Equation 2 for the masking-effect data described in the prior art: 5.71, 6.57, 7.29, 8.06, 14.80, 15.00. The results are shown in Table 15.

【０１０２】[0102]

【表１５】 [Table 15]

【０１０３】表１５よりＵｔ_(-6,-5)の時、最小値とな
ることがわかり、１５．００、１４．８０を外れ値とす
る。このように数２に示した計算式を用いれば、マスク
効果を回避することができる。Table 15 shows that the minimum is attained at Ut_(-6,-5), so 15.00 and 14.80 are taken as outliers. Using Equation 2 in this way avoids the masking effect.

【０１０４】実施例９．図４は、この実施例を説明する
ための図である。この実施例のデータ処理装置は、セン
サー等の計測手段を有し、これより得られた測定値か
ら、外れ値を検出する外れ値検出手段を有す。次に、外
れ値がある場合は、これを除いたデータで平均値を求め
るデータ処理手段を有す。次に、実際の適用例について
述べる。センサーから得られる測定値を、一定時間間隔
ごとに５個測定し外れ値を検出し、これを除いた平均値
を測定値とすることを考える。ｘ₁、ｘ₂、ｘ₃、
ｘ₄、ｘ₅が、測定データとして得られる。検出統計量
Ｕｔを計算し、外れ値を求める。外れ値があれば外れ値
を除き偏りのない平均値を求めることができる。データ
は時刻ｔ１、ｔ２、ｔ３、ｔ４について表１６に示す。Example 9. FIG. 4 illustrates this example. The data processing apparatus of this example has measuring means such as a sensor, outlier detecting means that detects outliers among the measured values, and data processing means that, when outliers are present, calculates the mean of the data with the outliers removed. A practical application is as follows. Five values are measured by the sensor at fixed time intervals, outliers are detected, and the mean of the remaining values is used as the measurement. Measurements x₁, x₂, x₃, x₄, x₅ are obtained, the detection statistic Ut is calculated, and outliers are identified. If outliers exist, removing them yields an unbiased mean. Table 16 shows the data for times t1, t2, t3, and t4.
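
The data processing step of this example, averaging a batch of five readings after discarding detected outliers, might look like the sketch below. Because the readings of Table 16 are not reproduced in this text, the five values from paragraph [0059] are used as a stand-in batch. The function names are hypothetical, and the statistic assumes the natural logarithm, the population standard deviation, and readings that are not all identical.

```python
import math
from statistics import mean, pstdev


def keep_after_detection(values, s_max=2):
    """Return the readings left after removing the tail values that minimise Ut."""
    x = sorted(values)
    best_u, best_kept = len(x) * math.log(pstdev(x)), x    # Ut(0): nothing removed
    for s in range(1, s_max + 1):
        for low in range(s + 1):                           # low smallest, s - low largest
            kept = x[low:len(x) - (s - low)]
            if len(kept) < 2 or pstdev(kept) == 0:
                continue                                   # ln(sigma) undefined: skipped
            u = len(kept) * math.log(pstdev(kept)) + 2 * s
            if u < best_u:
                best_u, best_kept = u, kept
    return best_kept


def measurement(values, s_max=2):
    """Mean of one batch of sensor readings with detected outliers excluded."""
    return mean(keep_after_detection(values, s_max))


batch = [5.71, 6.57, 7.29, 8.06, 13.32]    # stand-in batch, not the Table 16 data
print(measurement(batch))
```

For this batch the reading 13.32 is excluded and the reported measurement is the mean of the remaining four values.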

【０１０５】[0105]

【表１６】 [Table 16]

【０１０６】時刻ｔ１のデータについて検出統計量Ｕｔ
を求める。除かない時、Ｕｔ₍₀₎＝−０．７１１とな
る。１．０２を除いた時、Ｕｔ_(-1)＝０．９５２、３．
２３を除いた時、Ｕｔ_(-5)＝０．７０９となる。１．０
２、３．２３ともに除いた時、Ｕｔ_(-1,-5)＝２．７６
０となる。これを表１７にまとめると次のようになる。
Ｕｔ_(-1,-2)、Ｕｔ_(-4,-5)も考えられるがこの実施例
では影響がないので表に示すのを省略する。以下の表で
も同様に影響がないものは表示しないことにする。The detection statistic Ut is calculated for the data at time t1. With no value removed, Ut₍₀₎ = -0.711. With 1.02 removed, Ut_(-1) = 0.952; with 3.23 removed, Ut_(-5) = 0.709. With both 1.02 and 3.23 removed, Ut_(-1,-5) = 2.760. These results are summarized in Table 17. Ut_(-1,-2) and Ut_(-4,-5) could also be considered, but they do not affect the result in this example and are omitted from the table. Likewise, entries that do not affect the result are omitted from the tables that follow.

【０１０７】[0107]

【表１７】 [Table 17]

【０１０８】外れ値がない時のＵｔが−０．７１１と最
小である。従って外れ値はない。時刻ｔ２のデータにつ
いてＵｔを求め、表１８に示す。Ut is smallest, at -0.711, when no value is removed, so there are no outliers. Table 18 shows Ut for the data at time t2.

【０１０９】[0109]

【表１８】 [Table 18]

【０１１０】０．７４を除いた時のＵｔが−３．５２４
と最小である。従って０．７４を外れ値とする。時刻ｔ
３のデータについてＵｔを求め、表１９に示す。Ut is smallest, at -3.524, when 0.74 is removed, so 0.74 is taken as the outlier. Table 19 shows Ut for the data at time t3.

【０１１１】[0111]

【表１９】 [Table 19]

【０１１２】従って、０．０８を外れ値とする。次項ｔ
４のデータについてＵｔを求め、表２０に示す。Therefore, 0.08 is taken as the outlier. Table 20 shows Ut for the data at time t4.

【０１１３】[0113]

【表２０】 [Table 20]

【０１１４】従って、３．３６を外れ値とする。更新さ
れたデータ及び平均値は、表２１のようになる。Therefore, 3.36 is set as an outlier. Table 21 shows the updated data and average values.

【０１１５】[0115]

【表２１】 [Table 21]

【０１１６】実施例１０．この実施例は、外れ値検出手
段を有し、外れ値を知らせる出力手段を有するデータ処
理装置について述べる。入力工程から表２２のようなデ
ータが得られ、このデータからＵｔを計算することによ
り外れ値を検出する。表２２のデータを図示すると図５
のようになる。Example 10. This example describes a data processing apparatus having outlier detecting means and output means for reporting outliers. Data such as those in Table 22 are obtained from the input process, and outliers are detected by calculating Ut from these data. The data of Table 22 are plotted in FIG. 5.

【０１１７】[0117]

【表２２】 [Table 22]

【０１１８】このデータについて検出統計量Ｕｔを求め
ると表２３のようになる。Table 23 shows the detection statistics Ut for this data.

【０１１９】[0119]

【表２３】 [Table 23]

【０１２０】表２３より４、−３を外れ値とする。従来
は、入力工程から得られたデータをディスプレー等に図
５に示すようなグラフを表示し、人が目視によって外れ
値と思われる値をピック・アップしてから計算し、確か
めていた。本発明による装置により外れ値を自動的に検
出し、工程環境・条件に異常があったかどうかの確認を
行うことができる。From Table 23, 4 and -3 are taken as outliers. Conventionally, the data obtained from the input process were displayed as a graph such as FIG. 5 on a display or the like, and a person visually picked out values that appeared to be outliers and then confirmed them by calculation. The apparatus of the present invention detects outliers automatically, so that it can be checked whether an abnormality occurred in the process environment or conditions.

【０１２１】実施例１１．図６はこの実施例を説明する
ための図である。図６で示した外れ値検出方法は、図１
の入力工程と大小判定工程の間に、入力したデータを加
工する加工工程が追加されたものである。この実施例で
は、時間とともに増加する傾向がある特性値を出力する
装置からデータをうけとり、加工工程により時間ととも
に増加する傾向を除いたデータから、外れ値を検出する
装置について述べる。データを表２４に示す。これを図
示すると図７の様になり、データは時間とともに増加す
る傾向があることが分かる。Example 11. FIG. 6 illustrates this example. The outlier detection method of FIG. 6 adds, between the input process and the size judging process of FIG. 1, a processing step that transforms the input data. This example describes an apparatus that receives data from a device whose output characteristic tends to increase with time, removes that trend in the processing step, and detects outliers in the detrended data. The data are shown in Table 24. Plotted as in FIG. 7, they show a tendency to increase with time.

【０１２２】[0122]

【表２４】 [Table 24]

【０１２３】そこで、最小２乗法により傾向直線を求め
ると次のようになる。ｙ＝１．９３＋０．７０ｔデータから傾向直線の値を引くことにより、下のような
データとなる。 −０．６３、０．６７、−１．０３、１．２７、０．５
７、−１．１３、−０．８３、３．４７、−２．２３、
０．０７これを図示すると、図８のようになる。この補正された
データから、Ｕｔを計算すると表２５のようになる。A trend line is therefore fitted by the least squares method, giving y = 1.93 + 0.70t. Subtracting the trend-line values from the data gives the following: -0.63, 0.67, -1.03, 1.27, 0.57, -1.13, -0.83, 3.47, -2.23, 0.07. These values are plotted in FIG. 8. Table 25 shows Ut calculated from the corrected data.
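
The detrending step can be sketched as follows. Table 24 is not reproduced in this text, so the series `y` below is a reconstruction obtained by adding the quoted trend line back to the quoted corrected values at t = 1 to 10; it should be treated as an assumption rather than as the original table. The helper name `fit_line` is hypothetical, and the fitted coefficients are rounded to two decimals to mirror the equation given in the text.

```python
def fit_line(t, y):
    """Ordinary least-squares fit of y = intercept + slope * t."""
    n = len(t)
    t_bar, y_bar = sum(t) / n, sum(y) / n
    sxy = sum((a - t_bar) * (b - y_bar) for a, b in zip(t, y))
    sxx = sum((a - t_bar) ** 2 for a in t)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


t = list(range(1, 11))
y = [2, 4, 3, 6, 6, 5, 6, 11, 6, 9]        # reconstructed, not copied from Table 24

intercept, slope = fit_line(t, y)
a, b = round(intercept, 2), round(slope, 2)  # rounded as in the quoted trend line

# Corrected data: observation minus trend-line value at the same time.
corrected = [round(yi - (a + b * ti), 2) for ti, yi in zip(t, y)]
print(a, b, corrected)
```

The corrected values can then be passed to the same detection procedure used for untransformed data, and any value flagged there is mapped back to its time index, as the text does for t = 8.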

【０１２４】[0124]

【表２５】 [Table 25]

【０１２５】従って、補正されたデータ３．４７に対応
するｔ＝８のｙ＝１１を外れ値とする。この様に、時間
とともに増加する傾向のあるデータから、増加する傾向
を補正することにより、実施例１の外れ値検出方法を適
用することができる。Therefore, y = 11 at t = 8, which corresponds to the corrected value 3.47, is taken as the outlier. By removing the increasing trend from data that tend to increase with time in this way, the outlier detection method of Example 1 can be applied.

【０１２６】実施例１２．この実施例は、一つのサンプ
ルに複数の特性値がある場合に、加工工程においてテコ
比（Ｘ（Ｘ^TＸ）^-1Ｘ^T）の対角要素を計算し、これを
もとに外れ値を検出する装置について説明する。データ
および計算されたテコ比を表２６に示す。Example 12. This example describes an apparatus that, when each sample has multiple characteristic values, calculates in the processing step the diagonal elements of the leverage (lever ratio) matrix X(X^T X)^-1 X^T and detects outliers based on them. Table 26 shows the data and the calculated leverages.
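
As an illustration of the leverage calculation, the sketch below handles the simplest case only: a design matrix X with an intercept column and a single explanatory variable, for which the diagonal of X(X^T X)^-1 X^T has the closed form 1/n + (x_i - mean)^2 / Σ(x_j - mean)^2. This is a simplification of the multi-variable case in the text, the data are hypothetical (Table 26 is not reproduced here), and the function name `leverages` is chosen for illustration.

```python
def leverages(x):
    """Diagonal of X (X^T X)^-1 X^T for the design matrix X = [1, x]."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((v - x_bar) ** 2 for v in x)
    return [1 / n + (v - x_bar) ** 2 / sxx for v in x]


x = [1.0, 2.0, 3.0, 4.0, 20.0]     # hypothetical values; the last one is far from the rest
h = leverages(x)
print([round(v, 3) for v in h])
```

The leverages always sum to the number of fitted parameters (2 here), and the isolated point receives by far the largest value; the resulting list is what the detection statistic Ut would then be applied to.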

【０１２７】[0127]

【表２６】 [Table 26]

【０１２８】テコ比を用いて検出統計量Ｕｔを求めると
表２７のようになる。Table 27 shows the detection statistic Ut obtained using the lever ratio.

【０１２９】[0129]

【表２７】 [Table 27]

【０１３０】−２２．９９が最小値である。従って０．
９９つまりサンプルｎｏ．１４を外れ値とする。テコ比
が大きいデータは、全体に与える影響が大きいので、外
れ値か否か容易に判定できる外れ値検出装置があること
は有効である。The minimum value is -22.99. Therefore the leverage 0.99, that is, sample no. 14, is taken as the outlier. Because data with a large leverage (lever ratio) strongly influence the whole, an outlier detection apparatus that can readily judge whether such data are outliers is useful.

【０１３１】実施例１３．この実施例は、入力されたデ
ータが回帰分析のモデルの場合、加工工程において回帰
分析の残差を求め、これをもとに外れ値を検出する装置
について説明する。データおよび回帰式により求めた残
差は表２８のようになる。Example 13. This example describes an apparatus that, when the input data follow a regression model, calculates the regression residuals in the processing step and detects outliers based on them. Table 28 shows the data and the residuals obtained from the regression equation.

【０１３２】[0132]

【表２８】 [Table 28]

【０１３３】残差を用いて検出統計量Ｕｔを求めると表
２９のようになる。Table 29 shows the detection statistics Ut obtained by using the residuals.

【０１３４】[0134]

【表２９】 [Table 29]

【０１３５】Ｕｔ７．７２３が最小である。従って残差
３．９４８つまりサンプルｎｏ．６が外れ値となる。The smallest value is Ut = 7.723. Therefore the residual 3.948, that is, sample no. 6, is the outlier.

【０１３６】実施例１４．この実施例は、入力されたデ
ータの特性値が複数あり正準相関分析モデルを適応でき
る場合、加工工程において次に示すようにデータを加工
し、これを用いて外れ値を検出する装置について説明す
る。表３０のようなデータについて考える。Example 14. This example describes an apparatus that, when the input data have multiple characteristic values and a canonical correlation analysis model can be applied, transforms the data in the processing step as described below and detects outliers from the result. Consider data such as those in Table 30.

【０１３７】[0137]

【表３０】 [Table 30]

【０１３８】このデータに正準相関分析を行い、ｙ１，
ｙ２の合成変量関数が２個求まる。この合成変量関数を
用いて合成変量関数値が求まる。合成変量関数値をもと
にテコ比を計算する。テコ比は次のようになる。０．２７、０．３１、０．１８、０．１２、０．１０、
０．２０、０．１９、０．３０、０．１２、０．３０、
０．６３、０．１３、０．１６このテコ比について検出統計量Ｕｔを計算すると表３１
のようになる。Canonical correlation analysis is applied to these data, yielding two composite variate functions of y1 and y2. The composite variate values are obtained from these functions, and the leverages (lever ratios) are calculated from those values. The leverages are as follows: 0.27, 0.31, 0.18, 0.12, 0.10, 0.20, 0.19, 0.30, 0.12, 0.30, 0.63, 0.13, 0.16. Table 31 shows the detection statistic Ut calculated for these leverages.

【０１３９】[0139]

【表３１】 [Table 31]

【０１４０】従ってテコ比０．６３を外れ値とする。こ
れはｎｏ．１１のサンプルである。Therefore the leverage (lever ratio) 0.63 is taken as the outlier; this is sample no. 11.

【０１４１】実施例１５．この実施例は、入力されたデ
ータが２グループに特性値が分類されていて、加工工程
において複数の要因により判別分析を行い、この加工さ
れたデータをもとに各グループでの外れ値を検出する装
置について述べる。データは表３２に示す。Example 15. This example describes an apparatus for input data whose characteristic values are classified into two groups: in the processing step a discriminant analysis is performed on multiple factors, and outliers within each group are detected from the transformed data. The data are shown in Table 32.

【０１４２】[0142]

【表３２】 [Table 32]

【０１４３】[0143] Plotting the data gives FIG. 9. From the figure, sample no. 1 appears to be an outlier in group 1, and sample no. 12 appears to be an outlier in group 2. Performing discriminant analysis on the data gives the discriminant function
y = -0.634 * x1 - 0.281 * x2.
Discriminant function values are calculated using this function. For example, for sample no. 1,
discriminant function value = -0.634 * 6 - 0.281 * 0 = -3.80.
The discriminant function values are listed in the right-hand column of Table 32. Calculating Ut for the discriminant function values of group 1 gives the results shown in Table 33.
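The worked value for sample no. 1 can be checked with a few lines of code. This is only a sketch of the arithmetic stated in the text; the function name `discriminant_value` is hypothetical.

```python
def discriminant_value(x1: float, x2: float) -> float:
    """Linear discriminant function from the text: y = -0.634*x1 - 0.281*x2."""
    return -0.634 * x1 - 0.281 * x2


# Sample no. 1 in the text has x1 = 6 and x2 = 0.
print(round(discriminant_value(6, 0), 2))  # -3.8
```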

【０１４４】[0144]

【表３３】 [Table 33]

【０１４５】[0145] Therefore, the discriminant function value -3.80, sample no. 1, is taken as an outlier. Calculating Ut for the discriminant function values of group 2 gives the results shown in Table 34.

【０１４６】[0146]

【表３４】 [Table 34]

【０１４７】[0147] Therefore, the discriminant function value -1.12, sample no. 12, is taken as an outlier.

【０１４８】[0148] Example 16. This example describes an apparatus for the case where the characteristic values of the input data are classified into three groups. Discriminant analysis is performed on a plurality of factors in the processing step, and outliers in each group are detected based on the resulting data. The data are shown in Table 35.

【０１４９】[0149]

【表３５】 [Table 35]

【０１５０】[0150] Performing discriminant analysis on the data and calculating the discriminant function values gives the values in the discriminant function value columns on the right side of Table 35. Because there are three groups, two sets of discriminant function values are obtained. In general, (number of groups - 1) sets of discriminant function values are obtained. From these two sets of discriminant function values, the Euclidean distance is calculated with the following formula and listed in the Euclidean distance column of Table 35:
$d = (f_1^2 + f_2^2)^{1/2}$
Ut is then calculated for the Euclidean distances of the discriminant function values, separately for each group. The Ut values for group 1 are shown in Table 36.
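The distance step can be sketched as follows, assuming each sample is given as a tuple of its group label and its two discriminant function values. The function name `distances_by_group` and the tuple layout are hypothetical and are not taken from Table 35.

```python
import math
from collections import defaultdict


def distances_by_group(samples):
    """Return {group: [d, ...]} where d = sqrt(f1**2 + f2**2).

    samples: iterable of (group, f1, f2) tuples (assumed layout).
    """
    groups = defaultdict(list)
    for group, f1, f2 in samples:
        groups[group].append(math.hypot(f1, f2))
    return dict(groups)
```

Each per-group list of distances is then handled like any other one-dimensional data set when the detection statistic is calculated.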

【０１５１】[0151]

【表３６】 [Table 36]

【０１５２】[0152] Ut is smallest when there are no outlier candidates. Therefore there are no outliers. The Ut values for group 2 are shown in Table 37.

【０１５３】[0153]

【表３７】 [Table 37]

【０１５４】[0154] Ut is smallest when there are no outlier candidates. Therefore there are no outliers. The Ut values for group 3 are shown in Table 38.

【０１５５】[0155]

【表３８】 [Table 38]

【０１５６】[0156] Therefore, the Euclidean distance 2.44, sample no. 18, is taken as an outlier. Although this example describes the case where the characteristic values fall into three groups, the same procedure applies to three or more groups.

【０１５７】[0157] Example 17. This example describes the case where the detection statistic is obtained with a calculation formula different from Equation 2 used in the preceding examples. Equation 3 is used as the calculation formula for the detection statistic Uta.

【０１５８】[0158]

【数３】 [Equation 3]

【０１５９】[0159] The following data are used (15 values in total):
-1.40, -0.44, -0.30, -0.24, -0.22, -0.13, -0.15, 0.06, 0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01
The calculation results are shown in Table 39.

【０１６０】[0160]

【表３９】 [Table 39]

【０１６１】[0161] Therefore, from the table, 1.01 and -1.40 are taken as outliers. As the reference column of Table 39 shows, Ut = n log σ + 2s gives only -1.40 as an outlier, so Uta designates one more outlier. For many other data sets, however, the results are the same as those obtained with Equation 2. The calculation formula of Equation 3 is thus characterized by detecting one more outlier in some cases. The second term of Equation 3, b₂/2, was derived with reference to Takeuchi's index for correcting the goodness of fit of a normal distribution. According to Takeuchi (Kei Takeuchi (1976): "Distribution of Information Statistics and Criteria for Model Adequacy", Mathematical Sciences (Suri Kagaku), No. 153, Saiensu-sha, 12-18), the statistic T_s expressing the adequacy of a normal distribution model (hereinafter Takeuchi's statistic) is given as follows, where z_i (i = 1, ..., n) are the data of sample size n and z̄ is the mean of the z_i:
$T_s = -\log\sigma - \frac{b_2}{2n}$
where
$\sigma^2 = \frac{1}{n}\sum_i (z_i - \bar{z})^2, \qquad b_2 = \frac{\sum_i (z_i - \bar{z})^4}{n\sigma^4}$
The larger the value of Takeuchi's statistic, the closer the data are to an appropriate normal distribution model.
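The definition of Takeuchi's statistic above translates directly into code. The sketch below assumes the natural logarithm, which the text does not specify, and the function name `takeuchi_statistic` is hypothetical.

```python
import math


def takeuchi_statistic(z):
    """T_s = -log(sigma) - b2 / (2n), with sigma and b2 as defined in the text.

    Assumes the natural logarithm; the text does not state the base.
    """
    n = len(z)
    mean = sum(z) / n
    var = sum((v - mean) ** 2 for v in z) / n           # sigma squared
    if var == 0:
        raise ValueError("variance is zero, so log(sigma) is undefined")
    sigma = math.sqrt(var)
    b2 = sum((v - mean) ** 4 for v in z) / (n * var ** 2)  # sum / (n * sigma^4)
    return -math.log(sigma) - b2 / (2 * n)
```

For two data sets with the same shape but different spread, for example `[-0.1, 0.0, 0.1]` and `[-1.0, 0.0, 1.0]`, b₂ is identical and the tighter set gives the larger statistic, consistent with the role of log σ described in the text.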

【０１６２】[0162] Suppose there are two normal distribution models, a and b, and the variance of model a is smaller than the variance of model b. Under model a, more of the data z_i then take values close to the mean (= 0) than under model b. The term b₂/2n, the second term of the above formula for Takeuchi's statistic, is called a correction term and serves to correct the value of log σ in the first term. Takeuchi's statistic is therefore dominated by the value of log σ in the first term, and is characterized by the value of the variance σ. Accordingly, the smaller the variance, the larger the value of Takeuchi's statistic, and the larger the value of Takeuchi's statistic, the closer the pattern is to normal distribution model a than to model b (that is, a pattern with small variance).

【０１６３】[0163] Example 18. In Examples 18 to 46 below, the calculation formulas for the detection statistics are created on the basis of the regression analysis explanatory variable selection criteria shown in FIG. 10. Equations 2 and 3 described above were derived from the AIC, which is one example of a regression analysis explanatory variable selection criterion. Examples 18 to 46 therefore show that the same effects as in the preceding examples are obtained when the detection statistic is based on a regression analysis explanatory variable selection criterion other than the AIC. The calculation formulas of the detection statistics shown in Examples 18 to 46, Equations 4 to 32, were derived from the selection criteria shown in FIG. 10 and can be broadly divided into two types. The first group has the same form as in the preceding examples:
n log σ + second term (+ third term).
The second group is of the multiplicative type:
adjustment factor × σ.
The data
-1.40, -0.44, -0.30, -0.24, -0.22, -0.15, -0.13, 0.06, 0.10, 0.18, 0.20, 0.39, 0.48, 0.63, 1.01
are used in this example and in all subsequent examples, and are plotted in FIG. 11. For the same data, the number of outliers found differs depending on the calculation formula. A formula can therefore be chosen according to whether the application should report few outliers or many.
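The overall procedure for a statistic of the first type can be sketched as follows, using Ut = n log σ + 2s, the form quoted in paragraph [0161]. Several details are assumptions rather than statements from the text: n is taken as the number of retained values, σ as their population standard deviation, and the logarithm as natural. The function names `ut` and `detect_outliers` are hypothetical, and the sketch is not guaranteed to reproduce the values in the tables.

```python
import math


def ut(values, s):
    """Ut = n*log(sigma) + 2*s for the n retained values (assumed definitions)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        return float("-inf")  # log(0): identical values, treated as minus infinity
    return n * math.log(math.sqrt(var)) + 2 * s


def detect_outliers(data, max_s):
    """Return the removed values of the candidate set with the smallest Ut.

    Candidates are the full data (s = 0) and every set obtained by dropping
    s values in total from the low and high ends of the sorted data, for
    s = 1..max_s. An empty list means that no outlier was found.
    """
    x = sorted(data)
    total = len(x)
    if max_s > total - 2:
        raise ValueError("max_s must leave at least two values")
    best_stat, best_removed = ut(x, 0), []
    for s in range(1, max_s + 1):
        for low in range(s + 1):
            high = s - low
            kept = x[low:total - high]
            removed = x[:low] + x[total - high:]
            stat = ut(kept, s)
            if stat < best_stat:
                best_stat, best_removed = stat, removed
    return best_removed


print(detect_outliers([-0.1, -0.05, 0.0, 0.05, 0.1, 10.0], max_s=2))  # [10.0]
```

Only contiguous runs of the sorted data are evaluated, so the number of candidate sets grows quadratically with `max_s` rather than combinatorially.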

【０１６４】検出統計量Ｓｔの計算式を、数４に示す。The formula for calculating the detection statistic St is shown in Equation 4.

【０１６５】[0165]

【数４】 [Equation 4]

【０１６６】Ｓｔを計算すると表４０のようになる。Table 40 shows the calculation of St.

【０１６７】[0167]

【表４０】 [Table 40]

【０１６８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０１６９】[0169] Example 19. The calculation formula for the detection statistic Ft is shown in Equation 5.

【０１７０】[0170]

【数５】 [Equation 5]

【０１７１】Ｆｔを計算すると表４１のようになる。Table 41 shows the calculation of Ft.

【０１７２】[0172]

【表４１】 [Table 41]

【０１７３】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０１７４】[0174] Example 20. The calculation formula for the detection statistic Tt is shown in Equation 6.

【０１７５】[0175]

【数６】 [Equation 6]

【０１７６】Ｔｔを計算すると表４２のようになる。Table 42 shows the calculation of Tt.

【０１７７】[0177]

【表４２】 [Table 42]

【０１７８】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０１７９】実施例２１．検出統計量ＴＩｔの計算式
を、数７に示す。Example 21. The calculation formula of the detection statistic TIt is shown in Formula 7.

【０１８０】[0180]

【数７】 [Equation 7]

【０１８１】ＴＩｔを計算すると表４３のようになる。Table 43 shows the calculation of TIt.

【０１８２】[0182]

【表４３】 [Table 43]

【０１８３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０１８４】[0184] Example 22. The calculation formula for the detection statistic Wt is shown in Equation 8.

【０１８５】[0185]

【数８】 [Equation 8]

【０１８６】Ｗｔを計算すると表４４のようになる。Table 44 shows the calculated Wt.

【０１８７】[0187]

【表４４】 [Table 44]

【０１８８】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０１８９】[0189] Example 23. The calculation formula for the detection statistic Pt is shown in Equation 9.

【０１９０】[0190]

【数９】 [Equation 9]

【０１９１】Ｐｔを計算すると表４５のようになる。Calculation of Pt is shown in Table 45.

【０１９２】[0192]

【表４５】 [Table 45]

【０１９３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０１９４】実施例２４．検出統計量Ｈｔ計算式を、数
１０示す。Example 24. Equation 10 shows the detection statistic Ht calculation formula.

【０１９５】[0195]

【数１０】 [Equation 10]

【０１９６】Ｈｔを計算すると表４６のようになる。Table 46 shows Ht calculated.

【０１９７】[0197]

【表４６】 [Table 46]

【０１９８】[0198] Therefore, -1.40, 1.01 and 0.63 are taken as outliers.

【０１９９】[0199] Example 25. The calculation formula for the detection statistic U.1t is shown in Equation 11.

【０２００】[0200]

【数１１】 [Equation 11]

【０２０１】[0201] Calculating U.1t gives the results shown in Table 47.

【０２０２】[0202]

【表４７】 [Table 47]

【０２０３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２０４】[0204] Example 26. The calculation formula for the detection statistic U.9t is shown in Equation 12.

【０２０５】[0205]

【数１２】 [Equation 12]

【０２０６】[0206] Calculating U.9t gives the results shown in Table 48.

【０２０７】[0207]

【表４８】 [Table 48]

【０２０８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２０９】実施例２７．検出統計量Ｕｕｔの計算式
を、数１３示す。Example 27. Equation 13 shows the calculation formula of the detection statistic Uut.

【０２１０】[0210]

【数１３】 [Equation 13]

【０２１１】Ｕｕｔを計算すると表４９のようになる。Table 49 shows the calculation of Uut.

【０２１２】[0212]

【表４９】 [Table 49]

【０２１３】[0213] Therefore, -1.40, 1.01 and 0.63 are taken as outliers.

【０２１４】[0214] Example 28. The calculation formula for the detection statistic Bt is shown in Equation 14.

【０２１５】[0215]

【数１４】 [Equation 14]

【０２１６】Ｂｔを計算すると表５０のようになる。Table 50 shows the calculation of Bt.

【０２１７】[0217]

【表５０】 [Table 50]

【０２１８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２１９】[0219] Example 29. The calculation formula for the detection statistic Dt is shown in Equation 15.

【０２２０】[0220]

【数１５】 [Equation 15]

【０２２１】Ｄｔを計算すると表５１のようになる。Table 51 shows the calculation of Dt.

【０２２２】[0222]

【表５１】 [Table 51]

【０２２３】[0223] Therefore, -1.40, 1.01 and 0.63 are taken as outliers.

【０２２４】[0224] Example 30. The calculation formula for the detection statistic Gt is shown in Equation 16.

【０２２５】[0225]

【数１６】 [Equation 16]

【０２２６】Ｇｔを計算すると表５２のようになる。Table 52 shows the calculation of Gt.

【０２２７】[0227]

【表５２】 [Table 52]

【０２２８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２２９】[0229] Example 31. The calculation formula for the detection statistic Qt is shown in Equation 17.

【０２３０】[0230]

【数１７】 [Equation 17]

【０２３１】Ｑｔを計算すると表５３のようになる。Table 53 shows the calculation of Qt.

【０２３２】[0232]

【表５３】 [Table 53]

【０２３３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２３４】[0234] Example 32. The calculation formula for the detection statistic It is shown in Equation 18.

【０２３５】[0235]

【数１８】 [Equation 18]

【０２３６】Ｉｔを計算すると表５４のようになる。Table 54 shows a calculation of It.

【０２３７】[0237]

【表５４】 [Table 54]

【０２３８】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as the outlier.

【０２３９】[0239] Example 33. The calculation formula for the detection statistic Vt is shown in Equation 19.

【０２４０】[0240]

【数１９】 [Equation 19]

【０２４１】Ｖｔを計算すると表５５のようになる。Table 55 shows the calculation of Vt.

【０２４２】[0242]

【表５５】 [Table 55]

【０２４３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２４４】[0244] Example 34. The calculation formula for the detection statistic Et is shown in Equation 20.

【０２４５】[0245]

【数２０】 [Equation 20]

【０２４６】Ｅｔを計算すると表５６のようになる。Calculation of Et results in Table 56.

【０２４７】[0247]

【表５６】 [Table 56]

【０２４８】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as the outlier.

【０２４９】[0249] Example 35. The calculation formula for the detection statistic Jt is shown in Equation 21.

【０２５０】[0250]

【数２１】 [Equation 21]

【０２５１】Ｊｔを計算すると表５７のようになる。Table 57 shows the calculation of Jt.

【０２５２】[0252]

【表５７】 [Table 57]

【０２５３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２５４】実施例３６．検出統計量Ｖｔｄの計算式
を、数２２示す。Example 36. Formula 22 shows the calculation formula of the detection statistic Vtd.

【０２５５】[0255]

【数２２】 [Equation 22]

【０２５６】Ｖｔｄを計算すると表５８のようになる。Table 58 shows the calculation of Vtd.

【０２５７】[0257]

【表５８】 [Table 58]

【０２５８】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２５９】実施例３７．検出統計量ＢＢｔの計算式
を、数２３示す。Example 37. Formula 23 shows the calculation formula of the detection statistic BBt.

【０２６０】[0260]

【数２３】 [Equation 23]

【０２６１】ＢＢｔを計算すると表５９のようになる。Table 59 shows the calculation of BBt.

【０２６２】[0262]

【表５９】 [Table 59]

【０２６３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２６４】実施例３８．検出統計量ＣＣｔの計算式
を、数２４示す。Example 38. Equation 24 shows the calculation formula of the detection statistic CCt.

【０２６５】[0265]

【数２４】 [Equation 24]

【０２６６】ＣＣｔを計算すると表６０のようになる。Table 60 shows the calculation of CCt.

【０２６７】[0267]

【表６０】 [Table 60]

【０２６８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２６９】実施例３９．検出統計量ＤＤｔの計算式
を、数２５示す。Example 39. Equation 25 shows the calculation formula of the detection statistic DDt.

【０２７０】[0270]

【数２５】 [Equation 25]

【０２７１】ＤＤｔを計算すると表６１のようになる。Table 61 shows the calculation of DDt.

【０２７２】[0272]

【表６１】 [Table 61]

【０２７３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０２７４】実施例４０．検出統計量ＧＧｔの計算式
を、数２６示す。Example 40. Formula 26 shows the calculation formula of the detection statistic GGt.

【０２７５】[0275]

【数２６】 [Equation 26]

【０２７６】ＧＧｔを計算すると表６２のようになる。Table 62 shows the calculation of GGt.

【０２７７】[0277]

【表６２】 [Table 62]

【０２７８】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as the outlier.

【０２７９】実施例４１．検出統計量Ｕｐｔの計算式
を、数２７示す。Example 41. Equation 27 shows the calculation formula of the detection statistic Upt.

【０２８０】[0280]

【数２７】 [Equation 27]

【０２８１】Ｕｐｔを計算すると表６３のようになる。Table 63 shows the calculation of Upt.

【０２８２】[0282]

【表６３】 [Table 63]

【０２８３】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２８４】[0284] Example 42. The calculation formula for the detection statistic Zt is shown in Equation 28.

【０２８５】[0285]

【数２８】 [Equation 28]

【０２８６】Ｚｔを計算すると表６４のようになる。Table 64 shows the calculation of Zt.

【０２８７】[0287]

【表６４】 [Table 64]

【０２８８】[0288] Therefore, -1.40, 1.01 and 0.63 are taken as outliers. Zt tends to detect a larger number of outliers.

【０２８９】[0289] Example 43. The calculation formula for the detection statistic Kt is shown in Equation 29.

【０２９０】[0290]

【数２９】 [Equation 29]

【０２９１】Ｋｔを計算すると表６５のようになる。Table 65 shows the calculation of Kt.

【０２９２】[0292]

【表６５】 [Table 65]

【０２９３】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２９４】[0294] Example 44. The calculation formula for the detection statistic Xt is shown in Equation 30.

【０２９５】[0295]

【数３０】 [Equation 30]

【０２９６】Ｘｔを計算すると表６６のようになる。Table 66 shows the result of calculating Xt.

【０２９７】[0297]

【表６６】 [Table 66]

【０２９８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０２９９】実施例４５．検出統計量ＨＱｔの計算式
を、数３１示す。Example 45. Equation 31 shows the calculation formula of the detection statistic HQt.

【０３００】[0300]

【数３１】 [Equation 31]

【０３０１】ＨＱｔを計算すると表６７のようになる。Table 67 shows the calculation of HQt.

【０３０２】[0302]

【表６７】 [Table 67]

【０３０３】従って、−１．４０を外れ値とする。Therefore, -1.40 is set as an outlier.

【０３０４】実施例４６．検出統計量ＡＩＣｔの計算式
を、数３２示す。Example 46. Equation 32 shows the calculation formula of the detection statistic AICt.

【０３０５】[0305]

【数３２】 [Equation 32]

【０３０６】ＡＩＣｔを計算すると表６８のようにな
る。Table 68 shows the calculation of AICt.

【０３０７】[0307]

【表６８】 [Table 68]

【０３０８】従って、−１．４０と１．０１を外れ値と
する。Therefore, -1.40 and 1.01 are set as outliers.

【０３０９】[0309]

【発明の効果】[Effects of the Invention] According to the first invention, once the values are input, outliers are detected based on the calculated detection statistics, so there is no need to compare calculated values against a numerical table as in the conventional method. Nor is it necessary to set the number of outliers in advance; at most, it suffices to specify the maximum number of outliers to be detected. Furthermore, the calculation method does not have to be changed according to the number of outliers, or according to whether the outliers lie on the high side or the low side, so the calculation process is simple and the amount of computation is small. In addition, the conventional method sometimes failed to detect outliers because of the masking effect, but this can be avoided, so more accurate results are obtained. When no outlier exists, the method determines that none exists.

【０３１０】第２の発明によれば、外れ値を算出するた
めの値の選択が簡単に行える。According to the second invention, it is possible to easily select a value for calculating an outlier.

【０３１１】第３の発明によれば、検出統計量の単純な
比較だけで外れ値を求めるので、処理が簡単になる。According to the third invention, since the outlier is obtained only by simple comparison of the detection statistics, the processing becomes simple.

【０３１２】第４の発明における計算式によれば、外れ
値がある場合、最も外れた値が除かれると検出統計量は
最小となるので、これを利用して外れ値を求めることが
できる。According to the calculation formula of the fourth invention, when there is an outlier, the detection statistic becomes the minimum when the most outlier is removed. Therefore, the outlier can be obtained using this.

【０３１３】第５の発明における計算式によれば、補正
項があるため、より的確に外れ値を求めることができ
る。According to the calculation formula of the fifth aspect of the invention, since there is a correction term, the outlier can be obtained more accurately.

【０３１４】[0314] According to the calculation formula of the sixth invention, the first item includes the variance, so the property that the variance becomes smaller when the most deviant value is removed can be exploited.

【０３１５】[0315] According to the calculation formula of the seventh invention, the second item includes the number of outlier candidates, so the decreasing tendency of the first item caused by increasing the number of outliers can be offset.

【０３１６】[0316] According to the calculation formula of the eighth invention, the second item is multiplied by a coefficient, so the amount by which the second item increases can be adjusted.

【０３１７】[0317] According to the calculation formula of the ninth invention, the decreasing tendency of the variance can be adjusted by means of the coefficient.

【０３１８】第１０の発明における計算式によれば、回
帰分析の式を応用して外れ値の検出を行うことができ
る。According to the calculation formula of the tenth invention, it is possible to detect the outlier by applying the regression analysis formula.

【０３１９】第１１の発明によれば、加工工程があるこ
とによりさまざまなタイプのデータの外れ値を検出する
ことができる。According to the eleventh invention, outliers of various types of data can be detected due to the processing steps.

【０３２０】第１２の発明によれば、時間に依存する値
からも時間に依存しない値に加工することにより、外れ
値を検出することができる。According to the twelfth aspect, an outlier can be detected by processing a time-dependent value into a time-independent value.

【０３２１】第１３の発明によれば、１つのサンプルに
複数の特性値が存在する場合であっても、テコ比を計算
することにより外れ値を求めることができる。According to the thirteenth invention, even if one sample has a plurality of characteristic values, it is possible to obtain the outlier by calculating the lever ratio.

【０３２２】第１４の発明によれば、回帰分析の手法を
適用できる値であれば、回帰分析の残差を求めることに
よりこの残差の外れ値を求めることができる。According to the fourteenth invention, if the value can be applied to the regression analysis method, the residual of the regression analysis can be calculated to obtain the outlier of this residual.

【０３２３】第１５の発明によれば、正準相関分析モデ
ルを適用できる場合でも、外れ値を求めることができ
る。According to the fifteenth invention, even if the canonical correlation analysis model can be applied, the outlier can be obtained.

【０３２４】[0324] According to the sixteenth invention, outliers can be obtained when the characteristic values are classified into a plurality of groups and discriminant analysis can be performed.

【０３２５】第１７の発明によれば、上記のような外れ
値検出方法を利用することにより、外れ値を容易に検出
することができるデータ処理装置を得ることができる。
この外れ値が得られた時の環境条件を検討することによ
り、新たな知見・情報が得られることにもなる。According to the seventeenth invention, it is possible to obtain a data processing device which can easily detect an outlier by using the above outlier detection method.
By investigating the environmental conditions when this outlier is obtained, new knowledge and information can be obtained.

【０３２６】第１８の発明によれば、上記のような外れ
値検出方法を利用することにより、外れ値を容易に検出
し、除くことができるデータ処理装置を得ることができ
る。このデータ処理装置により得られた結果は、外れ値
を除いてあるので信頼性が向上している。According to the eighteenth invention, by utilizing the above outlier detection method, it is possible to obtain the data processing device which can easily detect and remove the outlier. Outliers are excluded from the results obtained by this data processing device, so that the reliability is improved.

【図面の簡単な説明】[Brief description of drawings]

【図１】本発明の外れ値検出方法を説明するためのフロ
ーチャート図である。FIG. 1 is a flowchart for explaining an outlier detection method of the present invention.

【図２】Ｇｒｕｂｂｓのデータ１を用いた場合の外れ値
の候補の個数ｓと検出統計量Ｕｔの関係を示す図であ
る。FIG. 2 is a diagram showing the relationship between the number s of outlier candidates and the detection statistic Ut when the Grubbs data 1 is used.

【図３】別のデータを用いた場合の外れ値の候補の個数
ｓと検出統計量Ｕｔの関係を示す図である。FIG. 3 is a diagram showing a relationship between the number s of outlier candidates and the detection statistic Ut when different data is used.

【図４】本発明のデータ処理装置の構成図である。FIG. 4 is a configuration diagram of a data processing device of the present invention.

【図５】本発明の一実施例の入力データをプロットした
図である。FIG. 5 is a diagram plotting input data according to an embodiment of the present invention.

【図６】本発明の外れ値検出のための工程を説明する図
である。FIG. 6 is a diagram illustrating a step for detecting an outlier according to the present invention.

【図７】本発明の一実施例の時間とともに増加する傾向
を持つデータをプロットした図である。FIG. 7 is a plot of data having a tendency to increase with time according to an embodiment of the present invention.

【図８】本発明の一実施例の時間とともに増加する傾向
を除いたデータをプロットした図である。FIG. 8 is a plot of data excluding a tendency of increasing with time according to an example of the present invention.

【図９】本発明の一実施例の入力データをプロットした
図である。FIG. 9 is a diagram plotting input data according to an embodiment of the present invention.

【図１０】回帰分析説明変数選択基準の数式を示す図で
ある。FIG. 10 is a diagram showing a mathematical formula for a regression analysis explanatory variable selection criterion.

【図１１】本発明の実施例の中で使われるデータをプロ
ットした図である。FIG. 11 is a plot of data used in examples of the present invention.

【図１２】従来の技術及び本発明の実施例の中で使われ
るデータをプロットした図である。FIG. 12 is a plot of data used in the prior art and examples of the present invention.

【図１３】従来の技術及び本発明の実施例で使われる装
置の構成図である。FIG. 13 is a block diagram of an apparatus used in a conventional technique and an embodiment of the present invention.

【図１４】従来の外れ値検出方式を説明するためのフロ
ーチャート図である。FIG. 14 is a flow chart diagram for explaining a conventional outlier detection method.

【符号の説明】[Explanation of symbols]

1 Information processing device
2 Computer (with FDD)
3 Display unit
4 Printer
5 Keyboard
6 Floppy disk

Claims

【特許請求の範囲】[Claims]

【請求項１】1. An outlier detection method comprising the following steps:
(a) an input step of inputting N values (N ≧ 3);
(b) a magnitude determination step of determining the magnitude relationship among the N values input in the input step;
(c) a calculation step of obtaining, based on the magnitude relationship determined in the magnitude determination step, the combination of the N values and combinations of fewer than N values from which outlier candidates have been removed, and calculating a detection statistic for each obtained combination using a predetermined calculation formula; and
(d) an outlier detection step of detecting an outlier based on the detection statistics calculated in the calculation step.

【請求項２】2. The outlier detection method according to claim 1, wherein, when up to s outliers are to be detected, the calculation step calculates the detection statistics using a plurality of combinations of n or more values (n = N - s) that are consecutive in the magnitude relationship determined in the magnitude determination step.

【請求項３】3. The outlier detection method according to claim 1 or 2, wherein the outlier detection step comprises a minimum value selection step of selecting the smallest of the detection statistics obtained from the combinations of fewer than N values, and an outlier determination step of, when the selected minimum value is smaller than the detection statistic obtained from the combination of the N values, determining as outliers the values not included in the combination from which the selected minimum value was calculated.

【請求項４】4. The outlier detection method according to claim 1, 2 or 3, wherein the calculation formula has a first item that tends to become smaller when outlier candidates are removed and a second item that becomes larger when outlier candidates are removed, and the calculation step calculates the values of the first and second items and obtains the detection statistic as their sum.

【請求項５】5. The outlier detection method according to claim 4, wherein the calculation formula further has, in addition to the first and second items, a correction term (third item) for correcting the first and second items, and the calculation step calculates the values of the first, second and third items and obtains the detection statistic as the sum of the three.

【請求項６】6. The outlier detection method according to claim 4 or 5, wherein the first item uses the variance of the fewer than N values for which the detection statistic is obtained.

【請求項７】上記第２の項目は、検出統計量を求める
場合の外れ値の候補の個数を用いていることを特徴とす
る請求項４又は５記載の外れ値検出方法。7. The outlier detection method according to claim 4, wherein the second item uses the number of outlier candidates in the case of obtaining the detection statistic.

【請求項８】上記第２の項目は、外れ値の候補の個数
に対して所定の係数を乗算したものを用いることを特徴
とする請求項７記載の外れ値検出方法。8. The outlier detection method according to claim 7, wherein the second item uses a value obtained by multiplying the number of candidates for outliers by a predetermined coefficient.

【請求項９】9. The outlier detection method according to claim 1, 2 or 3, wherein the calculation formula has the variance of the fewer than N values for which the detection statistic is obtained and a coefficient for the variance, and the calculation step obtains the detection statistic by multiplying the variance by the coefficient.

【請求項１０】10. The outlier detection method according to any one of claims 1 to 9, wherein the calculation formula is created on the basis of a variable selection criterion for regression analysis.

【請求項１１】上記外れ値検出方法は、更に、入力工
程と大小判定工程の間に、入力した値を加工する加工工
程を備えたことを特徴とする請求項１記載の外れ値検出
方法。11. The outlier detection method according to claim 1, further comprising a processing step of processing the input value between the input step and the magnitude determination step.

【請求項１２】上記加工工程は、入力工程により入力
された時間に依存する値を時間に依存しない値に加工す
ることを特徴とする請求項１１記載の外れ値検出方法。12. The outlier detection method according to claim 11, wherein in the processing step, the time-dependent value input by the input step is processed into a time-independent value.

【請求項１３】上記加工工程は、入力工程により入力
された値からテコ比を計算することを特徴とする請求項
１１記載の外れ値検出方法。13. The outlier detection method according to claim 11, wherein the processing step calculates the lever ratio from the value input in the input step.

【請求項１４】上記加工工程は、入力工程により入力
された値から回帰分析モデルのデータを計算することを
特徴とする請求項１１記載の外れ値検出方法。14. The outlier detection method according to claim 11, wherein in the processing step, the data of the regression analysis model is calculated from the value input in the input step.

【請求項１５】上記加工工程は、入力工程により入力
された値から正準相関分析モデルのデータを計算するこ
とを特徴とする請求項１１記載の外れ値検出方法。15. The outlier detection method according to claim 11, wherein in the processing step, the data of the canonical correlation analysis model is calculated from the value input in the input step.

【請求項１６】16. The outlier detection method according to claim 11, wherein, when the values input in the input step are classified into a plurality of groups and discriminant analysis is performed on a plurality of factors, the processing step calculates the discriminant function values of each group.

【請求項１７】17. A data processing device comprising: outlier detection means for detecting outliers by executing the outlier detection method according to any one of claims 1 to 16; measurement means for measuring N values and inputting them to the outlier detection means; and output means for reporting the outliers detected by the outlier detection means.

【請求項１８】18. The data processing device according to claim 17, further comprising data processing means for executing predetermined processing using the remaining values after excluding the outliers detected by the outlier detection means.