JP3337504B2

JP3337504B2 - neural network

Info

Publication number: JP3337504B2
Application number: JP27287192A
Authority: JP
Inventors: 武居利治; 竹村安弘
Original assignee: Sumitomo Osaka Cement Co Ltd
Current assignee: Sumitomo Osaka Cement Co Ltd
Priority date: 1992-10-12
Filing date: 1992-10-12
Publication date: 2002-10-21
Anticipated expiration: 2017-10-21
Also published as: JPH06124354A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to the configuration of a neural network used in information-processing fields such as the identification of objects, images, and characters, and the retrieval, inference, and association of information.

[0002]

[Prior Art] Neural network technology has long attracted attention as an effective way to carry out information processing at which von Neumann computers are poor, such as the recognition of images and speech, association, and inference. In particular, the error back-propagation learning method (hereinafter abbreviated as EBP), a learning method for neural networks, can be used to determine the connection weight values between the units in the network (signal-processing elements modeled on neurons, usually with multiple inputs and one output). This makes possible flexible recognition that captures the features of each input pattern.

[0003] FIG. 2 is a diagram showing the configuration of a typical conventional neural network of the kind described above. In a neural network trained by the EBP method, the units in the network are usually arranged in three or more layers. The layer that receives the pattern signal is called the input layer, the layer that outputs the recognition output signal is called the output layer, and the remaining layers are called intermediate layers or hidden layers. In principle, units within the same layer are not connected to one another, and no signal is fed back from a layer nearer the output to a layer nearer the input.

[0004] The output value I_n (n: unit number) of each unit of the input layer propagates to each unit of the intermediate layer. The output value C_n (n: unit number) of each intermediate-layer unit in turn propagates to each unit of the output layer, and an output value O_n (n: unit number) is obtained from each output-layer unit. In FIG. 2 the input-layer and intermediate-layer units appear to have multiple outputs because each is connected to several units, but this only means that the output of one unit is fed to several units; each unit has exactly one output value. In FIG. 2 the connections between units of adjacent layers are indicated by arrows. The connection weight value between an input-layer unit and an intermediate-layer unit is written V_ij, with the corresponding unit numbers as subscripts (i is the intermediate-layer unit number and j is the input-layer unit number). The connection weight value between an intermediate-layer unit and an output-layer unit is likewise written W_ij (i is the output-layer unit number and j is the intermediate-layer unit number). For example, the connection weight value between intermediate-layer unit 3 and output-layer unit 5 is written W₅₃.

[0005] The input/output characteristics of the intermediate-layer units and the output-layer units are usually expressed by the following equations:

$$C_i = f\Bigl(\sum_j V_{ij} I_j + \xi_i\Bigr) \tag{1}$$

$$O_i = f\Bigl(\sum_j W_{ij} C_j + \theta_i\Bigr) \tag{2}$$

$$f(x) = \frac{1}{1 + \exp(-x)} \tag{3}$$

Here i and j are the corresponding unit numbers as described above, and ξ and θ are the bias values of the respective units. The function in equation (3) is called the sigmoid function.
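
Equations (1) to (3) can be sketched in a few lines of Python. This is only an illustration of the conventional network of FIG. 2, not code from the specification; the function names `sigmoid` and `forward` and the list-of-lists layout of V and W are assumptions made here.

```python
import math

def sigmoid(x):
    """Equation (3): f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, V, xi, W, theta):
    """Propagate an input vector through the intermediate and output layers.

    V[i][j]: weight from input unit j to intermediate unit i; xi[i]: its bias.
    W[i][j]: weight from intermediate unit j to output unit i; theta[i]: its bias.
    (Layout and names are illustrative assumptions.)
    """
    # Equation (1): outputs C_i of the intermediate-layer units
    C = [sigmoid(sum(v * x for v, x in zip(row, inputs)) + b)
         for row, b in zip(V, xi)]
    # Equation (2): outputs O_i of the output-layer units
    O = [sigmoid(sum(w * c for w, c in zip(row, C)) + b)
         for row, b in zip(W, theta)]
    return C, O
```

With all weights and biases set to zero, every unit outputs f(0) = 0.5, which is a convenient sanity check.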

[0006] In a neural network configured as above, suppose an arbitrary input pattern, that is, an output vector (I₁, I₂, I₃, I₄, I₅) of the input-layer units, is given. By correcting the connection weight values between the units so that the sum of the squared errors between the desired outputs of the output layer and the actual outputs O_i (here i = 1, 2, ..., 5) becomes small, a neural network that produces the desired output pattern for an arbitrary input pattern can be obtained. The EBP method is a way of computing the weight corrections for this purpose and is also called the generalized δ rule. Several methods have been proposed for making learning converge efficiently using these corrections.
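
For reference, the quantity being minimized and the resulting correction for the output-layer weights can be written in the notation of equations (1) to (3). This is the usual textbook form of the generalized δ rule rather than anything stated in the specification; the target values t_i, the learning rate η, and the factor 1/2 are assumptions introduced here.

```latex
E = \frac{1}{2}\sum_i (t_i - O_i)^2, \qquad
\Delta W_{ij} = -\eta\,\frac{\partial E}{\partial W_{ij}}
             = \eta\,(t_i - O_i)\,O_i\,(1 - O_i)\,C_j
```

The factor O_i (1 - O_i) is the derivative of the sigmoid function of equation (3) evaluated at the net input of output unit i.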

[0007] In such a neural network, however, even when several representative patterns of the classes to be recognized are presented and the connection weight values are determined by learning, many learning iterations are needed before learning converges, so building the network takes time. Moreover, learning does not necessarily converge: the network often falls into a so-called local minimum and gives unsatisfactory recognition results. The outcome also depends on the interplay among the input patterns and on the number of intermediate-layer units, and the optimum number of intermediate units has often had to be found by trial and error. These difficulties stem from the fact that the basic equations of the neural network are given by (1) to (3). That is, the feature space reflecting the features of the input patterns is partitioned by N-dimensional surfaces of the form expressed by (1) and (2), and the discrimination regions are created by learning. The solutions are therefore very numerous and are not uniquely determined. As a result, some patterns may be hard to discriminate during learning, and even when the patterns used for learning can be told apart, other patterns may be misrecognized. The network is also poor at discriminating very similar patterns. In addition, once the weight values are determined they are fixed, so inputs that vary over time, for example, have had to be handled at learning time by preparing a learning set that covers such variation. As for generalization, the learning set has had to be prepared after predicting in advance how the network should generalize. In other words, because the feature space contains no unassigned region, even an unlearned input pattern is identified as one of the learned input patterns, which is a drawback.

[0008]

[Problems to Be Solved by the Invention] The present invention has been made to solve the above problems. Its object is to provide a neural network in which misrecognition is unlikely, which can recognize even very similar patterns, and whose weight values can in effect be updated by a simple update procedure in response to variations in the input.

[0009]

[Means for Solving the Problems] The present invention has been made to solve the above technical problems. The neural network of the present invention comprises: an input layer having at least N input units; an intermediate layer having an intermediate unit group, in which K sets of N intermediate units corresponding to the respective units of the input layer are provided in correspondence with the output categories, and a unit that supplies a bias; and a first output layer having K output units. The weight values of the connections between the units of the input layer and the intermediate unit group are all 1. The input/output characteristic of each intermediate unit of the intermediate unit group is expressed by a convex membership function created from a representative value and a statistical quantity indicating variation, taken for each of the N elements of the group of input patterns belonging to the corresponding output category. The weight values of the connections between the intermediate layer and the first output layer are determined by learning based on the δ rule, using the membership values output by the membership functions as the outputs of the intermediate unit group.

[0010] The present invention also provides a neural network comprising: an input layer having at least N input units; an intermediate layer having an intermediate unit group, in which K sets of N intermediate units corresponding to the respective units of the input layer are provided in correspondence with the output categories, and a unit that supplies a bias; a first output layer having K output units; and a second output layer. The weight values of the connections between the units of the input layer and the intermediate unit group are all 1. The input/output characteristic of each intermediate unit of the intermediate unit group is expressed by a convex membership function created from a representative value and a statistical quantity indicating variation, taken for each of the N elements of the group of input patterns belonging to the corresponding output category. The weight values of the connections between the intermediate layer and the first output layer are determined by learning based on the δ rule, using the membership values output by the membership functions as the outputs of the intermediate unit group. The second output layer compares, for each of the K output categories, the N elements in the corresponding intermediate units and outputs the smallest value for each output category. For a given input, the degree to which the category output by the first output layer is true is evaluated by the magnitude of the smallest value that the second output layer outputs for that category.

[0011] Preferably, the input/output characteristics of the intermediate units of the intermediate unit group can be changed by updating the representative value and the value indicating variation for the intermediate units that correspond to the category for which the first output layer gives the largest output value, as evaluated by the first output layer or the second output layer. In one preferred arrangement, one bias unit is assigned to each of the K sets of intermediate units, and the intermediate units and bias unit corresponding to an output category are connected only to the unit of the first output layer that corresponds to that category. In another preferred arrangement, a single bias unit is assigned to the intermediate unit group as a whole, and all the intermediate units and the bias unit in the intermediate layer are connected to all the units of the first output layer. Furthermore, for each of the N elements of the group of input patterns belonging to the corresponding output category, it is preferable to use a statistical quantity such as the mean or the median as the representative value and a statistical quantity such as the standard deviation or the variance as the quantity indicating variation.

[0012]

[Operation] In the above neural network, which has an N-dimensional input vector and K output categories, the weight values of the connections between the units of the input layer and the intermediate unit group are all set to 1. Next, for the plural input patterns, each an N-dimensional vector, that are to belong to a given category, a membership function is created for each vector element from its representative value (a statistic such as the mean or the median) and a quantity indicating its variation (such as the standard deviation or the variance). These membership functions are used as the input/output characteristics of the intermediate units of the intermediate unit group, so that each element of an N-dimensional input vector is converted, for each of the K output categories, into a membership value between 0 and 1. Whatever the input values are, therefore, they are converted into membership values between 0 and 1. Each membership value indicates how well the input element matches the corresponding element of the vector that represents the output category.
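
The conversion of an N-dimensional input vector into K sets of N membership values might look as follows. This is a minimal sketch that assumes the normal-distribution form of membership function mentioned in Embodiment 1, with the mean as the representative value and the standard deviation as the measure of variation; the function name and the `means[k][n]` / `sigmas[k][n]` layout are hypothetical.

```python
import math

def intermediate_outputs(x, means, sigmas):
    """Return K lists of N membership values for the input vector x.

    means[k][n] and sigmas[k][n] are the representative value and the
    standard deviation of element n for category k (assumed > 0).
    The input-to-intermediate weights are all 1, so x[n] is passed unchanged.
    """
    return [
        [math.exp(-((xn - m) ** 2) / (2.0 * s ** 2))
         for xn, m, s in zip(x, mk, sk)]
        for mk, sk in zip(means, sigmas)
    ]
```

An input that coincides with a category's representative vector yields membership values of 1 for every element of that category, and values nearer 0 for categories whose representative values are far away.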

[0013] Each of these K vectors of N elements corresponds to one output category. Either a bias element is added to each vector and the resulting N + 1 elements are connected to the corresponding output category, or a single bias element is added to the whole group of K vectors and the resulting N × K + 1 elements are connected to every output category. The weight values of these connections are then learned by the δ rule so that the desired output is obtained for a given input, which yields the connection weight values between the intermediate layer and the first output layer. The inputs used here may be the learning set used to create the membership functions or a different one. Because the membership functions express the degree to which each element of the input vector belongs to each output category, learning proceeds very efficiently even for very complex problems. The second output layer outputs K values, one per category: for each of the K vectors of N elements (excluding the bias of the intermediate layer), it outputs the minimum of that vector's elements.

[0014] As explained above, each element of the K vectors of N elements (excluding the bias of the intermediate layer) indicates the degree of belonging to the respective category. The minimum among the N elements (there are K such minima, one per output category) therefore indicates how poorly the worst-matching element of the input vector matches the corresponding element of each output category. If this value is very small, the input lies outside the statistically processed region and is not appropriate for that output category. Using the output values of the second output layer, the degree to which the result output by the first output layer is true can be evaluated.
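
The second output layer reduces to a per-category minimum. A minimal sketch, in which the function name and the `memberships[k]` layout (the N membership values of category k, bias excluded) are assumptions:

```python
def second_output_layer(memberships):
    """Return, for each of the K categories, the smallest of its N membership values."""
    return [min(group) for group in memberships]

# Three categories, two input elements each.
worst_match = second_output_layer([[0.9, 0.8], [0.02, 0.7], [0.5, 0.6]])
```

In this example the second category has one element that barely matches (0.02), so a large first-layer output for that category would be treated as doubtful.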

[0015] That is, even if the output value of the first output layer is large, the output can be judged doubtful if the output value of the second output layer is extremely small. The possibility of misrecognition can thus be reduced. Conversely, if the output value of the second output layer and the output value of the first output layer are both large, it is certain that the input vector belongs to that category, so the parameters of the input/output characteristic functions of the intermediate units of the intermediate unit group can be updated. This is because quantities such as the mean, the median, the standard deviation, and the variance can easily be updated mathematically as long as the number of input patterns used in learning so far is remembered. In this way, even if the environment changes for some reason and the inputs come to differ from those at learning time, the network can follow the change provided it is not rapid.
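
The update described here can be sketched for the mean and the variance, which need only the previous values and the sample count. The specification does not give formulas or thresholds, so the population-variance recurrence below, the function names, and the threshold values `t1` and `t2` are all assumptions made for illustration.

```python
def update_statistics(m, var, count, x):
    """Fold one new sample x into a running mean and population variance."""
    new_count = count + 1
    new_m = m + (x - m) / new_count
    # Re-centre the old squared deviations on the new mean, then add the new sample.
    new_var = (count * (var + (m - new_m) ** 2) + (x - new_m) ** 2) / new_count
    return new_m, new_var, new_count

def maybe_update(m, var, count, x, first_out, second_out, t1=0.8, t2=0.5):
    """Update only when both output layers give large values for the category.

    t1 and t2 are hypothetical thresholds; the specification only says "large".
    """
    if first_out >= t1 and second_out >= t2:
        return update_statistics(m, var, count, x)
    return m, var, count
```

For the samples 1, 2, 3 (mean 2, population variance 2/3), folding in a fourth sample of 4 gives a mean of 2.5 and a variance of 1.25, the same as recomputing from all four samples.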

[0016] Next, the neural network of the present invention is described in more detail and more concretely by way of the following embodiments, but the present invention is not limited to them.

[0017]

[Embodiment 1] To perform pattern identification, data representing the features of the pattern to be identified, such as an image, a character, or speech, are first acquired from that pattern. In this embodiment, two data values are acquired from each input pattern and entered into a computer, and the subsequent processing is carried out on the computer. FIG. 1 is a diagram showing the configuration of one embodiment of the neural network of the present invention. The first characteristic feature of the present invention is the way the layers are connected. Each unit of the intermediate layer other than the bias units is connected to exactly one unit of the input layer, and once the number of input-layer units and the number of output units of the output layer are fixed, the number of intermediate units follows automatically. That is, if the input vector has N elements and there are K output categories, the intermediate layer consists of an intermediate unit group in which K sets of the N intermediate units corresponding to the input-layer units are provided, one set per output category, plus one bias unit for each set, giving (N + 1) × K units in total.

[0018] The above is now explained with reference to FIG. 1. For simplicity, the input vector is assumed to have two elements and the number of output categories is three. First, the input-layer units I₁ and I₂ are connected only to the intermediate-layer units C₁, C₄, C₇ and C₂, C₅, C₈, respectively. The intermediate-layer units C₃, C₆, and C₉ are units that supply a bias and are therefore not connected to the input layer. Next, as to the connections between the intermediate layer and the output layer, the output unit O₁ of one output category, for example, is connected only to the two intermediate units C₁ and C₂, which correspond to the two input units, and to the bias unit C₃. The second feature of the present invention is the way the neural network is trained. The weight values of the connections between the input layer and the intermediate layer are all 1. The input/output characteristic of each intermediate unit of the intermediate unit group is a convex membership function created from the representative value and a statistical quantity indicating variation of each of the N elements of the group of input patterns belonging to the corresponding output category. The membership values given by these functions are output from the intermediate layer, and learning based on the δ rule is performed on these membership values. This is explained below with reference to FIG. 1.
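
The wiring of FIG. 1 as described in the text can be generated for any N and K. The sketch below assumes that the unit numbering continues the pattern given for the example (each category owns N consecutive intermediate units followed by its bias unit); the function name and the dictionary layout are hypothetical, and the links for O₂ and O₃ are inferred from that pattern rather than stated explicitly.

```python
def build_layout(n_inputs, n_categories):
    """Return (input_links, bias_units, output_links) using 1-based unit numbers.

    input_links[i]  : intermediate units fed by input unit I_i
    bias_units      : intermediate units that only supply a bias
    output_links[k] : intermediate units feeding output unit O_k
    """
    group = n_inputs + 1  # N intermediate units plus one bias unit per category
    input_links = {i: [] for i in range(1, n_inputs + 1)}
    bias_units = []
    output_links = {}
    for k in range(n_categories):
        base = k * group
        members = [base + n for n in range(1, n_inputs + 1)]
        for i, c in zip(range(1, n_inputs + 1), members):
            input_links[i].append(c)
        bias = base + group
        bias_units.append(bias)
        output_links[k + 1] = members + [bias]
    return input_links, bias_units, output_links

input_links, bias_units, output_links = build_layout(2, 3)
```

For N = 2 and K = 3 this reproduces the example: I₁ feeds C₁, C₄, C₇; I₂ feeds C₂, C₅, C₈; the bias units are C₃, C₆, C₉; and O₁ receives C₁, C₂, C₃. The total number of intermediate-layer units is (N + 1) × K = 9.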

[0019] First, the weight values of the connections between the input layer and the intermediate layer are all set to 1. Next, the input/output characteristic of each intermediate unit of the intermediate unit group is determined for the intermediate units corresponding to each output category (in FIG. 1, for example, the intermediate units C₁ and C₂ for the output O₁ of the first output category). For the input vectors used in learning that belong to the corresponding output category, the mean m and the standard deviation σ of each element are calculated. The characteristic is then taken to be either a trapezoidal membership function centered on the mean, with a top width equal to the standard deviation, a base width of about three times the standard deviation, and a height of 1, or a membership function of normal-distribution form, $\exp\bigl(-(x - m)^2 / (2\sigma^2)\bigr)$. For example, the input/output characteristics of the intermediate units C₁ and C₂ are calculated from the means and standard deviations of the elements I₁ and I₂ of the plural input vectors belonging to the output category corresponding to the output O₁. Any convex, continuous membership function may be used.
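
Both membership functions named in this paragraph can be sketched directly from the mean and standard deviation. The function names are hypothetical, the population form of the standard deviation is an assumption (the specification does not say which form is used), and σ is assumed to be greater than zero.

```python
import math

def mean_and_std(samples):
    """Mean and population standard deviation of one input element over a category."""
    n = len(samples)
    m = sum(samples) / n
    var = sum((x - m) ** 2 for x in samples) / n
    return m, math.sqrt(var)

def gaussian_membership(x, m, sigma):
    """exp(-(x - m)^2 / (2 sigma^2)): 1 at the mean, approaching 0 far from it."""
    return math.exp(-((x - m) ** 2) / (2.0 * sigma ** 2))

def trapezoid_membership(x, m, sigma, base_factor=3.0):
    """Trapezoid centred on m: top width sigma, base width base_factor * sigma, height 1."""
    half_top = sigma / 2.0
    half_base = base_factor * sigma / 2.0
    d = abs(x - m)
    if d <= half_top:
        return 1.0   # on the flat top
    if d >= half_base:
        return 0.0   # outside the base
    # Linear slope between the edge of the top and the edge of the base.
    return (half_base - d) / (half_base - half_top)
```

With m = 2 and σ = 1, the trapezoid is 1 for inputs between 1.5 and 2.5, falls linearly to 0.5 at 1.0 and 3.0, and is 0 beyond 0.5 and 3.5.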

[0020] After the connections between the input layer and the intermediate layer have been determined in this way, the connection weight values between the intermediate layer and the output layer are determined by the ordinary δ rule. The learning data used here may be the learning samples used to create the membership functions or different ones. In either case, the connection weight values between the input layer and the intermediate layer are all 1, and the input/output characteristics of the intermediate units of the intermediate unit group are fixed by the membership functions already described, so the output of the intermediate layer for a given input vector consists of membership values between 0 and 1. The connection weight values between the intermediate layer and the output layer can therefore be determined by supervised learning on these membership values. In the neural network of the present invention, the connections between the input layer and the intermediate unit group all have weight 1, the input/output characteristics of the intermediate units can be determined from statistical quantities, and the connections between the intermediate layer and the output layer can be determined by the δ rule, so construction is extremely simple. The network is therefore extremely effective for applications in which many patterns must be learned and in which the features appear statistically.
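
Training one output unit on membership values might look like the following. This is a sketch under stated assumptions, not the specification's procedure: the output unit is assumed to use the sigmoid of equation (3), the error is assumed to be the squared error, and the function names, learning rate, and epoch count are hypothetical. Each category's output unit sees only its own N membership values plus a bias unit whose output is fixed at 1.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_output_unit(samples, targets, lr=0.5, epochs=2000):
    """Learn N + 1 weights (the last one for the bias unit) with the delta rule.

    samples: list of membership-value lists (length N) for one category
    targets: 1.0 if the input belongs to the category, otherwise 0.0
    """
    n = len(samples[0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        for mu, t in zip(samples, targets):
            inputs = list(mu) + [1.0]          # append the bias unit's output
            o = sigmoid(sum(wi * xi for wi, xi in zip(w, inputs)))
            delta = (t - o) * o * (1.0 - o)    # error times sigmoid derivative
            w = [wi + lr * delta * xi for wi, xi in zip(w, inputs)]
    return w

def output_value(w, mu):
    """Output of the trained unit for one list of membership values."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, list(mu) + [1.0])))

weights = train_output_unit(
    [[0.9, 0.8], [0.95, 0.9], [0.1, 0.7], [0.2, 0.05]],
    [1.0, 1.0, 0.0, 0.0],
)
```

Because only a single layer of weights is trained, the procedure involves no back-propagation through the intermediate layer; the membership functions are fixed before this step begins.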

[0021] For example, when two categories are to be separated and the means of the elements of the input vectors of the two categories differ, the rough separation between them is expressed by the input/output characteristic functions of the intermediate units, that is, as membership values. A membership value is closer to 1 the nearer the input is to the mean and closer to 0 the farther it is. Suppose the connection weight values between the intermediate layer and the output layer are learned by the δ rule on these membership values. For the category that should be output for a given input, almost all elements of the input vector have appreciable membership values, because the functions were derived statistically. For a category that should not be output, some elements of the input vector have membership values close to 0. Consequently, for an element whose membership value in a category that should not be output is close to 0, however much the connection weight is strengthened it has no effect on that category, because the membership value itself is close to 0. As a result, the features of the elements of the category that should be output that are absent from the elements of the other categories are progressively reinforced. For elements that have similar membership values in different categories, learning proceeds so that they acquire positive and negative connection weights, respectively. In this way, the connection weight values between the intermediate layer and the output layer end up separating the features carried by the membership values still more finely.

[0022] The neural network described above cannot be applied to problems that are not linearly separable, such as EX-OR. For linearly separable problems, however, unlike the conventional three-layer neural network, there is no need to determine the number of intermediate-layer units by trial and error, and learning does not fall into local minima, so it can be made to converge reliably. In many pattern-recognition tasks, problems that are not linearly separable do not arise very often, and if the input vector supplied to the neural network of the present invention is preprocessed so that it is expressed as a feature vector, very efficient recognition can be performed. It goes without saying that, as the basis for creating the membership functions that serve as the input/output characteristics of the intermediate units through the statistical processing shown in this embodiment, the median of the elements of the input vectors may be used instead of the mean, and the variance instead of the standard deviation, with the same effect.

【００２３】[0023]

[Embodiment 2] This embodiment is characterized in that a second output layer, described below, is added to the neural network of Embodiment 1. As shown in FIG. 3, the relationship among the input layer, the intermediate layer and output layer 1 is exactly the same as in Embodiment 1, so a detailed description is omitted; the relationship among the input layer, the intermediate layer and output layer 2 is mainly described. The way the input layer and the intermediate layer are connected, the way the connection weights are determined, and the way the input/output characteristic function of each intermediate unit is determined are the same as in Embodiment 1. With N elements in the input vector and K output categories, the intermediate layer consists of K groups of N intermediate units, one group per output category and one unit per input-layer unit, plus one bias unit attached to each group, for a total of (N+1)×K units. The connection weights between the input layer and the intermediate layer are all 1, and the input/output characteristic of each intermediate unit is expressed by a convex membership function created from statistical quantities indicating the representative value and the variation of each of the N elements of the group of input patterns that should belong to the output category.
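As a rough illustration of the statistical processing described above, the following Python sketch computes, for each output category, the per-element mean and standard deviation from which the membership functions would be built, and counts the intermediate-layer units. The function names, the dictionary layout and the sample numbers are assumptions made only for this illustration, and the use of the sample standard deviation (N−1 denominator) is likewise an assumed choice.

```python
import statistics

def fit_category_stats(patterns_by_category):
    """Return {category: (means, stds)} with one mean/std per input element."""
    stats = {}
    for category, patterns in patterns_by_category.items():
        columns = list(zip(*patterns))  # transpose: one column per input element
        means = [statistics.mean(col) for col in columns]
        stds = [statistics.stdev(col) for col in columns]  # assumed: sample std
        stats[category] = (means, stds)
    return stats

def intermediate_unit_count(n_elements, n_categories):
    """(N + 1) x K: N intermediate units plus one bias unit per category."""
    return (n_elements + 1) * n_categories

# Hypothetical learning patterns: N = 2 elements, K = 2 categories.
learning_patterns = {
    "A": [(1.0, 2.0), (3.0, 2.0), (2.0, 5.0)],
    "B": [(8.0, 1.0), (9.0, 3.0), (10.0, 2.0)],
}
category_stats = fit_category_stats(learning_patterns)
unit_count = intermediate_unit_count(2, len(learning_patterns))
```

Each (mean, standard deviation) pair would parameterize one intermediate unit, so no trial-and-error choice of the number of intermediate units is involved.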

[0024] Next, the intermediate layer and output layer 2 are connected as follows. Each output unit is coupled one-to-one with the intermediate units of its own output category, with a connection weight of 1, and each output unit of output layer 2 outputs the smallest of the membership values output by the intermediate layer, that is, the smallest of the membership values of the input-vector elements. For example, output unit O₂₁ of output layer 2, which corresponds to output category 1, is connected only to the corresponding intermediate units C₁ and C₂ and not to the bias unit; the connection weights between the intermediate layer and output layer 2 are 1; and O₂₁ outputs the minimum of the membership values output by C₁ and C₂. The output of output layer 2 is therefore the membership value of the element that lies farthest from the mean of the corresponding element of the statistically processed learning patterns. In other words, if even one element deviates greatly from the learning patterns, the output value becomes small. The feature of this embodiment is that the reliability of the output-layer-1 result for a given output category is judged from the output value of output layer 2.
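The behaviour of output layer 2 can be sketched in a few lines of Python. This is a minimal sketch, assuming a Gaussian-shaped membership function exp(−(x−M)²/(2σ²)); the function names and the numeric statistics are hypothetical and serve only to illustrate the minimum operation.

```python
import math

def gaussian_membership(x, mean, std):
    """Convex, bell-shaped membership value in (0, 1]; equals 1 at x == mean."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2))

def output_layer2(x, means, stds):
    """Smallest membership value over the N elements of one category."""
    return min(gaussian_membership(xi, m, s) for xi, m, s in zip(x, means, stds))

# Hypothetical statistics for one category with N = 2 input elements.
means = [1.0, 5.0]
stds = [0.5, 2.0]

near = output_layer2([1.0, 5.0], means, stds)   # both elements at the mean
far = output_layer2([1.0, 11.0], means, stds)   # second element is 3 std away
```

Here `near` is 1.0, while `far` drops to about 0.011 even though the first element matches the mean exactly, which is the "one deviating element lowers the output" property described above.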

[0025] This is explained with reference to FIG. 4. The × marks, ○ marks and △ marks represent learning data belonging to categories A, B and C, respectively. The regions enclosed by rectangles in FIG. 4 show the boundaries of the membership functions created from the mean and standard deviation of the seven data belonging to each category. As described, in the membership function representing the input/output characteristic of each intermediate unit for each category (taken here to be of normal-distribution type), the dotted line marks the boundary one standard deviation from the mean and the dash-dot line marks the boundary three standard deviations from the mean. After the input/output characteristic of each intermediate unit has been determined in this way, the connection weights between output layer 1 and the intermediate layer are determined by the δ rule, using either the same set of learning data or a different set, as in Embodiment 1. Output layer 2, on the other hand, outputs the smallest of the membership values of the input-vector elements.

[0026] FIG. 5 is a table showing the outputs of output layer 1 and output layer 2 after the above operations. Data 1 to 3 are data of each category that were used for learning. For data 1, for example, the first-ranked output of output layer 1 is category A with a value of 0.93, and the second-ranked output is category C with a value of 0.001. The first-ranked output of output layer 2 is category A with a value of 0.33, and the second-ranked output is category C with a value of 0. Thus, for learned data, the output value of output layer 1 is very high and the output value of output layer 2 is also relatively high. The output of output layer 1, however, comes from a so-called generalized region derived from the learning data. For data 4 and data 6, for example, output layer 1 judges category A with a fairly high value, whereas the output of output layer 2, which rests on statistical processing, is very small. The feature of this embodiment is therefore that the output of output layer 2 is used as a measure of whether this degree of generalization is accurate. In the above example the learning data are too few for the output of output layer 2 to be considered statistically meaningful, but with a sufficiently large number of data it becomes statistically meaningful.

[0027] That is, assuming a normal distribution, about 68% of all data lie within one standard deviation of the mean, and about 95% lie within two standard deviations of the mean. The membership value at a point one standard deviation from the mean is 0.606, and the membership value at a point two standard deviations from the mean is 0.135 (exp(−2) ≈ 0.135). Therefore, in applications where an erroneous recognition result would cause trouble, for example recognition of medical images or identification of cell types, the second output value should be given priority, and the degree to which the result of the first output layer is true should be judged from its magnitude. Using FIG. 5, if an output value from output layer 2 greater than 0.135 is taken as the decision threshold, the recognition results of output layer 1 for data 4 and 6 can be judged highly unreliable. For data 5, the output of output layer 2 is reasonably reliable, but the output of output layer 1 is itself small, so this result can also be excluded from the recognition results.
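The numbers in this paragraph can be checked with a short Python sketch. It assumes a Gaussian-shaped membership function exp(−(x−M)²/(2σ²)), which reproduces the quoted 0.606 at one standard deviation; under that assumption the value 0.135 corresponds to two standard deviations, and three standard deviations give roughly 0.011. The function names and the lower bound of 0.5 on the output-layer-1 value are assumptions for illustration only; the two data rows use the values quoted in the text for data 1 and data 4.

```python
import math

def membership_at(k):
    """Membership value k standard deviations from the mean (Gaussian shape assumed)."""
    return math.exp(-(k ** 2) / 2.0)

one_sigma = membership_at(1)    # about 0.6065
two_sigma = membership_at(2)    # about 0.1353
three_sigma = membership_at(3)  # about 0.0111

def is_reliable(layer1_value, layer2_value, layer1_min=0.5, layer2_min=two_sigma):
    """Accept a result only if both output layers support it.

    layer1_min is a hypothetical cut-off; layer2_min follows the 0.135 threshold.
    """
    return layer1_value >= layer1_min and layer2_value > layer2_min

data1_ok = is_reliable(0.93, 0.33)  # learned data: accepted
data4_ok = is_reliable(1.0, 0.03)   # high layer-1 value, but layer 2 rejects it
```

The design choice illustrated here is that output layer 2 acts as a veto: a confident output-layer-1 value is discarded when the statistical support is weak.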

[0028] In the above example the number of data is small, so the effect described above does not appear clearly, but it becomes very effective when the input vector has many elements and the number of learning data is large. The output of output layer 2 only requires that the outputs of the intermediate units for the corresponding output category be compared and the smallest value output, so it can be computed with a simple comparator or calculated in a computer. Alternatively, since the membership values output by the intermediate unit group of an output category lie between 0 and 1, each output value may be subtracted from 1 to give a value between 1 and 0 and fed to a well-known winner-take-all neural network. That network identifies the largest input, that is, the intermediate unit with the smallest membership value, and the membership value output by that intermediate unit is then output as is. In this case almost all of the main parts can be built as neural networks, so the apparatus can be simpler than when a computer is used.
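The equivalence between taking the minimum directly and the winner-take-all route can be sketched as follows. The function name is hypothetical, and the `max` call merely stands in for the winner-take-all network, whose internal dynamics are not modelled here.

```python
def min_via_winner_take_all(memberships):
    """Pick the smallest membership value by running 'largest wins' on 1 - value."""
    inverted = [1.0 - v for v in memberships]        # 0..1 becomes 1..0
    winner = max(range(len(inverted)), key=inverted.__getitem__)
    return memberships[winner]                       # output the original value

values = [0.9, 0.2, 0.6]  # hypothetical membership values of one category
result = min_via_winner_take_all(values)
```

The winning unit's original membership value is returned, not the inverted one, matching the description above.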

[0029]

[Embodiment 3] This embodiment is characterized in that the input/output characteristics of the learned intermediate units of Embodiments 1 and 2 are updated in a simple manner by determining which category an unknown input vector belongs to. The configuration of the neural network itself is the same as in Embodiments 1 and 2, so its description is omitted. Suppose that there are N learning patterns for a certain category, and let M_i(N) be the mean and σ_i(N) the standard deviation of element i in the input/output characteristics of the intermediate units for that category. Suppose further that an unknown input vector X_i is input and is judged very likely to belong to this category. The mean and standard deviation of each element i for this category are then updated as follows.

[0030] That is, the mean M_i(N+1) and the standard deviation σ_i(N+1) for the N+1 data including this new datum are, respectively,

$$M_i(N+1) = \frac{N\,M_i(N) + X_i}{N+1}$$

$$\sigma_i(N+1) = \sqrt{\frac{N-1}{N}\,\sigma_i(N)^2 + \frac{N+1}{N^2}\,\bigl[X_i - M_i(N+1)\bigr]^2} \qquad (4)$$

so the mean and standard deviation of each element i can be obtained easily as long as the previous mean, the previous standard deviation and the number of samples are known.
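The update of equation (4) can be sketched in Python and checked against a full recomputation. This is a minimal sketch for a single element i; the function name and sample values are hypothetical, and the comparison with `statistics.stdev` reflects the observation that the recurrence matches the sample standard deviation (N−1 denominator), which is an inference from the formula rather than something the text states.

```python
import math
import statistics

def update_stats(n, mean, std, x):
    """Fold one new value x into (n, mean, std) following equation (4)."""
    new_mean = (n * mean + x) / (n + 1)
    new_var = (n - 1) / n * std ** 2 + (n + 1) / n ** 2 * (x - new_mean) ** 2
    return n + 1, new_mean, math.sqrt(new_var)

# Hypothetical values of one element i over N = 4 learning patterns.
samples = [2.0, 4.0, 4.0, 5.0]
n, mean, std = len(samples), statistics.mean(samples), statistics.stdev(samples)

# A new input judged to belong to the category is folded in without
# revisiting the stored learning patterns.
n, mean, std = update_stats(n, mean, std, 7.0)
```

Only three numbers per element (count, mean, standard deviation) need to be kept, which is why the update is cheap compared with relearning connection weights.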

[0031] A concrete explanation follows, with reference to the drawings. Suppose that an unknown input vector is given as, for example, data 4 shown in FIG. 4. With the method of Embodiment 1, only the output of output layer 1 in FIG. 5 is available, so the output value for category A is 1.0 and the output value for category C is 0.88. In that case, equation (4) can be used to update the membership functions that form the input/output characteristics of the intermediate units for category A. In Embodiment 2, on the other hand, output layer 2 gives an output value of 0.03 for category A and 0 for category C, so the unknown input is regarded as undecidable and the input/output characteristics are not updated. Such updating becomes more reliable the larger the number of learned samples, and the more frequently a category appears, the wider its recognition region tends to become, albeit gradually. With a conventional neural network, by contrast, updating the weight values is not simple: all connections have to be relearned from the beginning, which is very cumbersome.

[0032] In the neural network of the present invention, the same effect as updating the weight values is achieved by changing the input/output characteristics of the intermediate units through a very simple update procedure. The network also has the feature of updating itself in a self-organizing manner. For example, when the state of the input drifts little by little over time, a conventional neural network required learning data that took the drift into account in order to determine the connection weights; if the drift was irregular or unpredictable, such learning data could not be prepared and the network could not cope. With the neural network of the present invention, the input/output characteristic functions of the intermediate units can be updated even in such cases, so the drift can be handled to some extent.

[0033]

[Embodiment 4] This embodiment differs from Embodiments 1 to 3 in the number of bias units in the intermediate layer and in the way the intermediate layer and the output layer are connected. The way the input layer and the intermediate layer are connected, the way the connection weights are determined, and the way the membership function that forms the input/output characteristic of each intermediate unit is determined are the same as in Embodiment 1. That is, with N elements in the input vector and K output categories, K groups of N intermediate units are prepared, one group per output category and one unit per input-layer unit. The connection weights between the input layer and the intermediate layer are all 1, and the input/output characteristic of each intermediate unit is expressed by a convex membership function created from statistical quantities indicating the representative value and the variation of each of the N elements of the group of input patterns that should belong to the output category. However, only one bias unit is provided for the intermediate unit groups as a whole, so the number of intermediate-layer units is N×K+1. In addition, connections between the intermediate layer and the output layer are made between every unit in the intermediate layer and every unit in the output layer.

[0034] The above is now explained with reference to the drawings. FIG. 6 is a configuration diagram showing the configuration of this embodiment. To simplify the explanation, the input vector is assumed to have two elements and the number of output categories is three. The input-layer units I₁ and I₂ are connected only to the intermediate-layer units C₁, C₃, C₅ and C₂, C₄, C₆, respectively. The intermediate-layer unit C₇ is the bias unit and is therefore not connected to the input layer. As for the connections between the intermediate layer and the output layer, the output unit O₁ of one output category is connected to all of the intermediate-layer units C₁ to C₇, and the same holds for the other output units. The learning method of this neural network is also the same as in Embodiment 1, so a detailed description is omitted: the membership values of the elements for each output category, as output by the intermediate layer, are taken as inputs, and the connection weights between the intermediate layer and the output layer are obtained by learning with the δ rule. The difference from Embodiment 1 is thus the larger number of connections between the intermediate layer and the output layer. Learning takes correspondingly longer, but a finer structure of connection weights is obtained, so more complex recognition regions can be formed. Needless to say, in the neural network of this embodiment an output layer 2 can be provided as in Embodiment 2 to judge the degree to which the result of output layer 1 is true, and the membership functions that form the input/output characteristics of the intermediate units can be updated as in Embodiment 3. A second intermediate layer can also be provided between the intermediate layer and the output layer, with the connection weights between the layers determined by conventional EBP. This allows even more complex recognition regions to be formed.
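The fully connected arrangement of this embodiment can be illustrated with a small Python sketch of δ-rule learning. It is a sketch under stated assumptions: the output units are taken to be linear and the update is the Widrow-Hoff form of the δ rule, whereas the exact output function and learning constants of Embodiment 1 are not reproduced here. The function names, learning rate, epoch count and membership values are all hypothetical.

```python
def forward(weights, hidden):
    """Linear outputs from all membership values plus the single bias unit."""
    h = list(hidden) + [1.0]
    return [sum(w * v for w, v in zip(row, h)) for row in weights]

def train_delta_rule(hidden_outputs, targets, n_out, lr=0.1, epochs=200):
    """Learn intermediate-to-output weights; every unit feeds every output."""
    n_hidden = len(hidden_outputs[0]) + 1            # N*K units + 1 bias unit
    weights = [[0.0] * n_hidden for _ in range(n_out)]
    for _ in range(epochs):
        for hidden, target in zip(hidden_outputs, targets):
            h = list(hidden) + [1.0]
            outputs = forward(weights, hidden)
            for k in range(n_out):
                error = target[k] - outputs[k]       # delta for output unit k
                for j in range(n_hidden):
                    weights[k][j] += lr * error * h[j]
    return weights

# Hypothetical membership values for N = 2 elements and K = 2 categories.
hidden_outputs = [[0.9, 0.8, 0.1, 0.2], [0.1, 0.2, 0.9, 0.7]]
targets = [[1.0, 0.0], [0.0, 1.0]]
weights = train_delta_rule(hidden_outputs, targets, n_out=2)
out_first = forward(weights, hidden_outputs[0])
out_second = forward(weights, hidden_outputs[1])
```

Each output unit here carries N×K+1 weights instead of the N+1 weights of the per-category wiring, which is the source of both the longer learning time and the finer recognition regions mentioned above.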

[0035]

[Effects of the Invention] As described above, the neural network of the present invention provides the effects set out above. In summary, they amount to the following notable technical effects. First, learning with the neural network of the present invention always converges. The input/output characteristic of each intermediate unit can be expressed as a membership function using statistical quantities, and the connections between the intermediate layer and the output layer can be determined by the δ rule, so the procedure is extremely simple and there is no need to determine the intermediate-layer units by trial and error. Second, the degree to which the generalization of the neural network is true can be judged using statistical quantities, which is very effective in fields that require accurate decisions. Third, the input/output characteristics of the intermediate units can be updated by a simple procedure even after learning, so temporal drift of the input vector can be absorbed and categories that appear more frequently can be made easier to recognize.

[Brief description of the drawings]

FIG. 1 is a schematic configuration diagram showing the configuration of one embodiment of the neural network of the present invention.

FIG. 2 is a configuration diagram showing the configuration of a typical conventional three-layer neural network.

FIG. 3 is a configuration diagram showing the configuration of another embodiment of the neural network of the present invention.

FIG. 4 is a schematic diagram illustrating the recognition regions of the neural network according to the present invention.

FIG. 5 is a table showing an example of recognition by another embodiment of the neural network of the present invention.

FIG. 6 is a configuration diagram showing yet another embodiment of the neural network of the present invention.

[Explanation of symbols]

I₁, I₂: input units
C₁, C₂, C₃, C₄, C₅, C₆, C₇, C₈, C₈: intermediate units
O₁, O₂, O₃: output units

Continuation of front page
(56) References: JP-A-2-189635 (JP, A); JP-A-4-127239 (JP, A); JP-A-4-23088 (JP, A); JP-A-4-76678 (JP, A); JP-A-4-92901 (JP, A); JP-A-4-536 (JP, A); JP-A-2-292602 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name): G06N 3/00 - 3/10; G06F 15/18; G06G 7/60; JICST file (JOIS)

Claims

(57) [Claims]

[Claim 1] A neural network that associates an input pattern consisting of an N-dimensional input vector with K output categories, comprising: an input layer having at least N input units; an intermediate layer having intermediate unit groups, in which N intermediate units corresponding to the respective units of the input layer are provided in K sets corresponding to the output categories, and a unit that provides a bias; and a first output layer having K output units, wherein the connection weights between the units of the input layer and the intermediate unit groups are all 1, the input/output characteristic of each intermediate unit of the intermediate unit groups is expressed by a convex membership function created on the basis of statistical quantities indicating the representative value and the variation of each of the N elements of the group of input patterns belonging to the corresponding output category, and the connection weights between the intermediate layer and the first output layer are determined by learning based on the δ rule, with the membership values output from the membership functions taken as the outputs of the intermediate unit groups.

[Claim 2] A neural network that associates an input pattern consisting of an N-dimensional input vector with K output categories, comprising: an input layer having at least N input units; an intermediate layer having intermediate unit groups, in which N intermediate units corresponding to the respective units of the input layer are provided in K sets corresponding to the output categories, and a unit that provides a bias; a first output layer having K output units; and a second output layer, wherein the connection weights between the units of the input layer and the intermediate unit groups are all 1, the input/output characteristic of each intermediate unit of the intermediate unit groups is expressed by a convex membership function created on the basis of statistical quantities indicating the representative value and the variation of each of the N elements of the group of input patterns belonging to the corresponding output category, the connection weights between the intermediate layer and the first output layer are determined by learning based on the δ rule, with the membership values output from the membership functions taken as the outputs of the intermediate unit groups, the second output layer compares the N elements within each of the intermediate unit groups corresponding to the K output categories and outputs the smallest value for each output category, and, for a given input, the degree to which the category output by the first output layer is true is evaluated by the magnitude of the smallest value that the second output layer outputs for that category.

[Claim 3] The neural network according to claim 1 or 2, wherein the input/output characteristic of each intermediate unit of the intermediate unit groups is changed by updating the representative value and the quantity indicating the variation for the intermediate units corresponding to the category for which the first output layer outputs the largest value, as evaluated by the first output layer or the second output layer.

[Claim 4] The neural network according to any one of claims 1 to 3, wherein one unit that provides the bias is provided for each of the K sets of intermediate units, and the connections between the intermediate layer and the first output layer couple the intermediate units and the bias unit corresponding to an output category only to the unit of the first output layer corresponding to that output category.

[Claim 5] The neural network according to any one of claims 1 to 3, wherein a single unit that provides the bias is provided for the intermediate unit groups, and the connections between the intermediate layer and the first output layer couple all of the intermediate units in the intermediate layer and the bias unit to all of the units of the first output layer corresponding to the output categories.

[Claim 6] The neural network according to any one of claims 1 to 5, wherein, for the representative value and the value indicating the variation of each of the N elements of the group of input patterns belonging to the corresponding output category, a statistical quantity such as the mean or the median is used as the representative value, and a statistical quantity such as the standard deviation or the variance is used as the quantity indicating the variation.