JPH07113917B2

JPH07113917B2 - Neural network, control method thereof, and arithmetic unit for neural network

Info

Publication number: JPH07113917B2
Application number: JP3080385A
Authority: JP
Inventors: Bernhard Boser (ボーザー バーンハード)
Original assignee: AT&T Corp
Current assignee: AT&T Corp
Priority date: 1991-03-20
Filing date: 1991-03-20
Publication date: 1995-12-06
Anticipated expiration: 2010-12-06
Also published as: JPH04311254A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to neural networks used for pattern recognition, and more particularly to computational elements for such networks.

[0002]

[Prior Art] Computer-based collection, handling, manipulation, storage, and transmission of information have driven the development and adoption of computing systems based on adaptive learning with a variety of neural network architectures. These computing systems are commonly called automatic learning networks, neural networks, hierarchically layered networks, and massively parallel computing networks.

[0003] Applying these computing systems to problems such as the automatic recognition, analysis, and classification of character patterns in a given image is a potentially valuable approach.

[0004] In assessing the value of such applications relative to conventional approaches, it is necessary to focus on two key operational parameters: speed and accuracy.

[0005] Until now, the accuracy of these systems has been improved by making the architectures more complex, by using more extensive training routines, and by using more values at intermediate decision levels (for example, gray-scale encoding).

[0006]

[Problems to Be Solved by the Invention] Improvements that raise the accuracy of a system, however, tend to reduce its speed. Because most systems are implemented on general-purpose or special-purpose processors, added complexity and rigor in the implementation generally translate into additional program steps for the processor. The response of the processor is consequently slowed by the number of additional program steps that must be executed.

[0007] As a result, it would appear that faster and more effective neural network computing systems can be realized only by replacing the processor with one that is faster and equally or more accurate. The object of the present invention is to provide a neural network that achieves high-speed operation without changing the processor, while maintaining the accuracy and reliability of its computations.

[0008]

[Means for Solving the Problems] In the present invention, the many neurons (computational elements) of a neural network are trained while each uses a computationally complex nonlinear squashing function, so that accurate weights are determined for each neuron. The computationally complex nonlinear squashing function is then replaced by a less complex nonlinear squashing function, and data presented to the network are classified with the less complex function in each neuron. Reducing computational complexity in this way yields higher operational speed without sacrificing the quality, accuracy, or reliability of the results obtained by the network.

[0009] Replacing the computationally complex nonlinear squashing function with a less complex one also reduces the amount of computation that each neuron must perform for the neural network.

[0010] In one embodiment of the invention, a differentiable nonlinear function such as the hyperbolic tangent is replaced by a piecewise-linear threshold logic function. Differentiable nonlinear functions are required by certain well-known training criteria such as back propagation.

[0011] In another embodiment of the invention, each neuron or computational element of the neural network comprises several nonlinear function elements representing nonlinear squashing functions of differing complexity. These elements are controllably switched into and out of operation during the training and classification phases in accordance with the principles of the invention.

[0012]

[Embodiments] An embodiment of the invention is described below as applied to pattern recognition and, in particular, to optical character recognition. This example is presented to illustrate the invention and is not intended to limit its scope.

[0013] The invention is also applicable to fields such as speech processing and recognition, automatic control systems, and other artificial intelligence or automatic learning networks that are commonly grouped under the heading of neural networks. Hereinafter, the term "neural network" refers generically to all such networks.

[0014] Computational elements such as those shown in FIGS. 1 and 2 form the fundamental functional and connectionist building blocks of most neural or learning networks. In general, a computational element forms a weighted sum of the values of n+1 input signals and passes that sum through a nonlinear function element f(α) to produce a single value, where α is the input to the nonlinear function element.

[0015] This nonlinearity is often called a nonlinear squashing function and includes functions such as the hard limiter, threshold logic element, and sigmoid functions shown in FIGS. 3 through 7. Input and output values of the computational element may be analog values, quasi-analog values such as multi-level or gray-scale values, or essentially binary values.

[0016] In operation, the computational element of FIGS. 1 and 2 examines n distinct input signals. During a typical operation such as optical character recognition, these may be regarded as adjacent input pixels, pixel values, or unit values from an image or feature map. The inputs have values denoted a_1, a_2, ..., a_n.

[0017] An input bias is applied to the (n+1)th input of the computational element. For simplicity, this input bias is generally set to a constant value such as 1.

[0018] The input signals and the input bias are applied to multipliers 1-1 through 1-(n+1). Each multiplier receives its other input from a weight vector having weights w_1 through w_{n+1}. The outputs of all the multipliers are applied to adder 2, which produces the weighted sum of the input values.

[0019] The output of adder 2 is thus simply the dot product of the vector of input values (including the bias value) with the vector of weights.

[0020] The output of adder 2 is passed through the nonlinear function element to produce a single unit output value x_i. As will become clearer below, the unit output value x_i represents the value of the i-th unit in the feature map under consideration.
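The computational element of paragraphs [0014] to [0020] can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `computing_element`, the default choice of `math.tanh` as f(α), and the sample numbers are not taken from the patent.

```python
import math

def computing_element(inputs, weights, nonlinearity=math.tanh, bias_input=1.0):
    """Return f(alpha), where alpha is the weighted sum of n inputs plus a bias input.

    Assumption: the nonlinearity defaults to tanh purely for illustration.
    """
    if len(weights) != len(inputs) + 1:
        raise ValueError("expected n + 1 weights for n inputs plus the bias input")
    extended = list(inputs) + [bias_input]                   # a_1..a_n, then the bias input
    alpha = sum(w * a for w, a in zip(weights, extended))    # dot product (adder 2)
    return nonlinearity(alpha)                               # unit output value x_i

# Three inputs (n = 3), so four weights w_1..w_4; the values are arbitrary.
x_i = computing_element([0.5, -1.0, 0.25], [0.2, 0.4, -0.8, 0.1])
```

Passing a different `nonlinearity` argument changes only the last step; the multiply-and-add part of the element is unaffected.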

[0021] The first nonlinear function element is used during the training phase of network operation to establish the weight vector values by a standard training algorithm such as back propagation or the like. During a non-training phase, such as a classification or recognition phase, the second nonlinear function element is used by the network. In this embodiment, the second nonlinear function element is computationally less complex than the first.

[0022] Lower computational complexity means that the operations needed to obtain the result f(α) are simpler, and fewer in number, for the second nonlinear function element than for the first. This will become clear from the description of FIGS. 3 through 7 below.

[0023] Because the nonlinear function used in the non-training phase is required to be computationally less complex, each computational element operates faster in the non-training phase than the corresponding element does when it uses the first nonlinear function element during training.

[0024] Computational complexity is best understood with reference to FIGS. 3 through 7. FIGS. 5 and 7 show continuous nonlinear functions, which are computationally more complex than the piecewise-linear or quasi-continuous functions shown in FIGS. 3, 4, and 6.

[0025] In this embodiment, a computationally complex nonlinear function (for example, FIG. 5 or FIG. 7) is used during the training phase to establish a value for each weight vector entry. After the weight vector values have been set, a less complex nonlinear function (for example, FIG. 3, FIG. 4, or FIG. 6) is used in place of the computationally complex one.

[0026] Continuous nonlinear functions such as those shown in FIGS. 5 and 7 require a considerable amount of computation to determine f(α) for a given value of α, and are therefore computationally complex. Each of these functions has two horizontal asymptotes.

[0027] Less complex nonlinear functions such as those shown in FIGS. 3, 4, and 6 require considerably less computation to determine the value of f(α). The hard limiter function of FIG. 3 is only a minimal approximation of the far more complex nonlinear function of FIG. 7. The closest approximation to the function of FIG. 7 is the function of FIG. 6, which is a piecewise-linear threshold logic function.

[0028] This function is called piecewise linear because it is made up of several linear segments that together form one complete function. Although the breakpoints are shown at positions symmetric with respect to the ordinate and the abscissa, other arrangements, such as segments with different slopes or a function curve shifted to the left or right, are also contemplated for use in the invention.
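To make the comparison in paragraphs [0026] to [0028] concrete, the sketch below measures how closely two low-complexity functions track a hyperbolic tangent. The exact curves of FIGS. 3 through 7 are not reproduced in this text, so the output levels of ±1, the unit slope, the breakpoints at ±1, and the evaluation grid are all assumptions chosen for illustration.

```python
import math

def hard_limiter(alpha):
    """Two-level output (assumed levels: -1 for negative alpha, +1 otherwise)."""
    return 1.0 if alpha >= 0.0 else -1.0

def threshold_logic(alpha, slope=1.0, limit=1.0):
    """Piecewise-linear function: a line through the origin, clipped at +/- limit.

    Needs only one multiply and two comparisons, with no exponentials.
    """
    return max(-limit, min(limit, slope * alpha))

def max_abs_error(approx, reference, lo=-4.0, hi=4.0, steps=801):
    """Largest |approx - reference| over an evenly spaced grid on [lo, hi]."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(abs(approx(a) - reference(a)) for a in grid)

err_hard = max_abs_error(hard_limiter, math.tanh)      # crude approximation
err_pwl = max_abs_error(threshold_logic, math.tanh)    # closer approximation
```

With these assumed shapes, the piecewise-linear function stays much closer to tanh than the hard limiter does, which matches the ordering stated in paragraph [0027].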

[0029] Although FIGS. 3 through 7 show several nonlinear functions of differing computational complexity, many other nonlinear functions are contemplated for use in the invention. For example, a Taylor series approximation of a particular nonlinear squashing function may be used as the computationally complex nonlinear function. An accurate Taylor series approximation of the hyperbolic tangent function is given by Equation 2 below.

[0030]

[Equation 2]

[0031] Here, n is a large integer and the a_i are constant coefficients.

[0032] A nonlinear function of reduced computational complexity that may be substituted for the accurate Taylor series expansion above is given by Equation 3 below.

[0033]

[Equation 3]

[0034] Here, m is a small integer such that m << n.
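The bodies of Equations 2 and 3 are not reproduced in this text. Based only on the surrounding definitions (a series in α with constant coefficients a_i, a large integer n, and a small integer m << n), one form consistent with them would be the following. The summation limits and the names f_1 and f_2 are assumptions, not a transcription of the original equations.

```latex
f_1(\alpha) = \sum_{i=0}^{n} a_i\,\alpha^{i},
\qquad
f_2(\alpha) = \sum_{i=0}^{m} a_i\,\alpha^{i},
\qquad m \ll n .
```

For the hyperbolic tangent, the leading terms of the Taylor series about the origin are $\tanh\alpha = \alpha - \tfrac{1}{3}\alpha^{3} + \tfrac{2}{15}\alpha^{5} - \tfrac{17}{315}\alpha^{7} + \cdots$, valid for $|\alpha| < \pi/2$, so truncating after a few terms gives a cheaper polynomial that is accurate only near the origin.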

[0035] It is desirable to choose, in place of the more complex function f_1(α), a less complex function f_2(α) that is a fairly good approximation of f_1(α). The invention also contemplates substituting any less complex function for any more complex function, even when there is little or no relationship between the two.

[0036] As shown in FIG. 1, the function f_1(α) of the computationally complex nonlinear function element 3, which is used during the training phase, is controllably switched out of operation for the computational element by switches 5 and 6. When the network is in a non-training phase, the function f_2(α) of the less complex nonlinear function element 4 is switched into operation for the computational element.

[0037] Various modifications of the switch arrangement shown in FIG. 1 are possible. One such variation connects the output x_i to each of nonlinear function elements 3 and 4 and uses switch 5 to direct the data to the correct nonlinear function element for the particular phase of network operation.

[0038] Another variation connects the input to each of nonlinear function elements 3 and 4 and operates switch 6 to select the nonlinear function element appropriate to the particular phase of operation.

[0039] FIG. 2 shows a single nonlinear function element 7, whose nonlinear function f_i(α) is evaluated for a given α. The function subscript i is 1 for the computationally complex nonlinear function and 2 for the less complex nonlinear function.

[0040] The arrangement of FIG. 2 requires fewer elements than the computational element of FIG. 1, but it is isomorphic with the arrangement shown in FIG. 1.
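The two arrangements of FIGS. 1 and 2 can be modelled side by side to show why they are interchangeable. In this sketch the class names, the `training` flag standing in for switches 5 and 6, and the particular choices of f_1 (tanh) and f_2 (a clipped straight line) are illustrative assumptions.

```python
import math

def f1(alpha):
    """Computationally complex function for training (assumed here: tanh)."""
    return math.tanh(alpha)

def f2(alpha):
    """Less complex function for non-training phases (assumed: clipped line)."""
    return max(-1.0, min(1.0, alpha))

def weighted_sum(weights, inputs):
    """Dot product of the weights with the inputs plus a bias input fixed at 1."""
    return sum(w * a for w, a in zip(weights, list(inputs) + [1.0]))

class SwitchedElement:
    """FIG. 1 style: two function elements, one of which is switched into operation."""
    def __init__(self, weights):
        self.weights = weights
        self.training = True              # stands in for switches 5 and 6

    def output(self, inputs):
        body = f1 if self.training else f2
        return body(weighted_sum(self.weights, inputs))

class IndexedElement:
    """FIG. 2 style: a single function element f_i, with i = 1 or i = 2."""
    bodies = {1: f1, 2: f2}

    def __init__(self, weights):
        self.weights = weights
        self.i = 1                        # 1 while training, 2 afterwards

    def output(self, inputs):
        return self.bodies[self.i](weighted_sum(self.weights, inputs))

weights = [0.6, -0.3, 0.2]
a, b = SwitchedElement(weights), IndexedElement(weights)
same_in_training = a.output([1.0, 2.0]) == b.output([1.0, 2.0])
a.training, b.i = False, 2                # leave the training phase
same_after_training = a.output([1.0, 2.0]) == b.output([1.0, 2.0])
```

Both elements keep the same weights across the phase change; only the function applied to the weighted sum differs.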

[0041] In an experimental embodiment, the sigmoid function chosen for the computationally complex nonlinear function element 3 of FIG. 1 during the training phase is the scaled hyperbolic tangent f_1(α) = c·tanh(Sα), where α is the weighted-sum input to nonlinear function element 3, c is the amplitude of the function, and S determines the slope of the function at the origin.

[0042] As described above and shown in FIG. 7, this nonlinear function is an odd function with horizontal asymptotes at +c and -c. Nonlinear functions with odd symmetry are believed to yield faster convergence of the weight vector components w_1 through w_{n+1} during training.
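The stated properties of the scaled hyperbolic tangent can be checked numerically. The values c = 2.0 and S = 0.5 below are arbitrary illustrative choices, since this text gives no numerical values for c or S, and the function name `scaled_tanh` is likewise an assumption.

```python
import math

def scaled_tanh(alpha, c=2.0, S=0.5):
    """Scaled hyperbolic tangent c * tanh(S * alpha); c and S are illustrative."""
    return c * math.tanh(S * alpha)

c, S = 2.0, 0.5

# Odd symmetry: f(-alpha) = -f(alpha).
odd_gap = abs(scaled_tanh(-1.3, c, S) + scaled_tanh(1.3, c, S))

# Horizontal asymptotes at +c and -c.
upper_gap = abs(scaled_tanh(100.0, c, S) - c)
lower_gap = abs(scaled_tanh(-100.0, c, S) + c)

# Central-difference estimate of the slope at the origin.
h = 1e-6
slope_at_origin = (scaled_tanh(h, c, S) - scaled_tanh(-h, c, S)) / (2 * h)
```

Differentiating c·tanh(Sα) gives a slope of c·S at the origin, so S sets the slope in proportion to the amplitude c rather than being numerically equal to it unless c = 1.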

[0043] The weights for each computational element of a learning network are obtained with a standard training algorithm. One such algorithm is the trial-and-error learning technique known as back propagation.

[0044] For this technique, see Rumelhart et al., "Learning Internal Representations by Error Propagation," in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, pp. 319-364 (Cambridge, Massachusetts: Bradford Books).

[0045] See also R. P. Lippmann, "Computing with Neural Nets," IEEE ASSP Magazine, Vol. 4, No. 2, pp. 4-22 (1987).

[0046] The disclosures of both of these references concerning automatic learning are expressly incorporated herein by reference. Prior to training, each weight is initialized to a random value using, for example, a uniform distribution between -2.4/F_i and 2.4/F_i, where F_i is the number of inputs (fan-in) of the unit to which the connection belongs. For the example shown in FIG. 1, the fan-in F_i equals n+1.
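The initialization rule of paragraph [0046] is short enough to write out directly. The function name `init_weights`, the fixed seed, and the choice n = 25 are assumptions for illustration; only the ±2.4/F_i bounds and the fan-in of n+1 come from the text.

```python
import random

def init_weights(fan_in, rng=None, scale=2.4):
    """Draw fan_in weights uniformly from [-scale / fan_in, +scale / fan_in]."""
    rng = rng or random.Random()
    bound = scale / fan_in
    return [rng.uniform(-bound, bound) for _ in range(fan_in)]

n = 25                                  # e.g. a 5 x 5 patch of inputs (illustrative)
weights = init_weights(n + 1, rng=random.Random(0))   # fan-in F_i = n + 1 with bias
```

Dividing by the fan-in keeps the initial weighted sum small however many inputs a unit has, which is what keeps the unit away from the flat ends of the sigmoid at the start of training.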

[0047] This initialization technique keeps values within the operating range of the sigmoid nonlinearity. During training, image patterns are presented in a constant order. Weights are updated according to the stochastic gradient or "on-line" procedure, that is, after each presentation of a single image pattern for recognition.

[0048] A true gradient procedure may instead be used, in which averaging takes place over the entire training set before the weights are updated. The stochastic gradient is found to make the weights converge faster than the true gradient, especially for large, redundant image databases.
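The difference between the two update schedules in paragraphs [0047] and [0048] can be shown on a single tanh unit. This is a toy sketch, not the patent's training code: the squared-error loss, the learning rate, the four training patterns, and all function names are assumptions.

```python
import math

def forward(weights, x):
    """Output of one unit: tanh of the dot product of weights and inputs."""
    return math.tanh(sum(w * a for w, a in zip(weights, x)))

def gradient(weights, x, target):
    """Gradient of 0.5 * (y - target)**2 with respect to each weight."""
    y = forward(weights, x)
    delta = (y - target) * (1.0 - y * y)        # chain rule through tanh
    return [delta * a for a in x]

def stochastic_epoch(weights, patterns, rate=0.1):
    """Update after every single pattern, presented in a fixed order."""
    for x, target in patterns:
        g = gradient(weights, x, target)
        weights = [w - rate * gi for w, gi in zip(weights, g)]
    return weights

def true_gradient_epoch(weights, patterns, rate=0.1):
    """Average the gradient over the whole training set, then update once."""
    total = [0.0] * len(weights)
    for x, target in patterns:
        total = [t + gi for t, gi in zip(total, gradient(weights, x, target))]
    return [w - rate * t / len(patterns) for w, t in zip(weights, total)]

def mean_error(weights, patterns):
    """Mean of 0.5 * squared error over all patterns."""
    return sum(0.5 * (forward(weights, x) - t) ** 2 for x, t in patterns) / len(patterns)

# Each pattern is ([input, bias input fixed at 1], target); values are arbitrary.
patterns = [([1.0, 1.0], 0.8), ([-1.0, 1.0], -0.8), ([0.5, 1.0], 0.4), ([-0.5, 1.0], -0.4)]
start = [0.0, 0.0]
w_online, w_batch = start, start
for _ in range(20):
    w_online = stochastic_epoch(w_online, patterns)
    w_batch = true_gradient_epoch(w_batch, patterns)
```

In one pass over the data the on-line schedule makes as many weight updates as there are patterns, while the true gradient schedule makes exactly one.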

[0049] The computational elements and the entire neural network may be realized in hardware, in software, or in any convenient combination of hardware and software.

[0050] Most of the neural networks presented here have been implemented on an AT&T DSP32C digital signal processor, which performs the elementary mathematical operations of addition, subtraction, multiplication, and division.

[0051] Pipelined devices, microprocessors, and special-purpose digital signal processors also provide convenient architectures for realizing neural networks according to the invention.

[0052] MOS VLSI technology has also been used to implement particular weighted interconnection networks of the type shown in FIG. 8. Local memory is desirable for storing pixel and unit values and other temporary computation results.

[0053] Each pixel has an associated value that corresponds to the intensity, color, or the like emanating from the small area of the visible character image covered by that pixel. The pixel values are then stored in a memory device.

[0054] In reference to a particular map, the terms "pixel" and "unit value" are used interchangeably and include the pixels, pixel values, and unit values output from each of the computational elements that combine to form the map array.

[0055] To visualize and understand the operation of the neural network, it may be more convenient to think in terms of planar or two-dimensional arrays of pixels rather than in terms of pixel values or unit values.

[0056] Standard techniques are used to convert a handwritten character into the pixel array that forms the supplied character image. The character image may be obtained by electronic transmission from a remote location, or it may be obtained locally with a scanning camera or other scanning device.

[0057] Regardless of its source, and in accordance with conventional practice, the character image is represented by an ordered collection of pixels, typically an array. Once represented, the character image is generally captured and stored in an optical or electronic memory device such as a frame buffer.

[0058] Various other preprocessing techniques used to prepare a character image as a pixel array for character recognition include transformations such as scaling, size normalization, deskewing, centering, and translation or shifting, all of which are well known to those skilled in the art.

[0059] In addition, converting the handwritten character to a gray-scale pixel array may be desirable in order to preserve information that would otherwise be irretrievably lost during preprocessing. This latter transformation is also well known to those skilled in the art.

[0060] In addition to the operations listed above for preparing an image for character recognition, it is generally desirable to provide a uniform border of substantially constant level around the original image.
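The border described in paragraph [0060] amounts to padding the pixel array with a constant value. The sketch below assumes an image stored as a list of rows; the function name, the border width, and the border level are illustrative, since the text specifies none of them.

```python
def add_constant_border(image, width=2, level=0.0):
    """Surround a 2-D image (a list of equal-length rows) with a constant border."""
    cols = len(image[0]) + 2 * width
    top = [[level] * cols for _ in range(width)]
    bottom = [[level] * cols for _ in range(width)]
    body = [[level] * width + list(row) + [level] * width for row in image]
    return top + body + bottom

image = [[1, 2, 3],
         [4, 5, 6]]
padded = add_constant_border(image, width=1, level=0)
```

A border of this kind lets a receptive field centred near the edge of the original image still see a full set of input values.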

[0061] FIG. 8 is a simplified block diagram of an embodiment of a hierarchically structured automatic learning network to which the present invention is applied. This network is described in U.S. patent application Ser. No. 444,455, filed November 30, 1989.

[0062] The disclosures of that application are expressly incorporated herein by reference. The network performs character recognition from a supplied image by massively parallel computation. Each array (box) shown in levels 20 through 60 comprises many computational elements, one per array unit.

[0063] The network shown in FIG. 8 has first and second feature detection stages and a character classification stage. Each stage comprises one or more feature maps or arrays of differing size. In most conventional applications the maps are square, but rectangular and other symmetric, asymmetric, or irregular map patterns are also contemplated.

[0064] The arrangement of detected features is called a map because an array is constructed in the memory device in which each pixel (unit value) is stored, and each feature detection from a lower-level map is placed at the appropriate location in the array for that map. The map thus records the presence, or substantial presence, of a feature together with its relative location.

[0065] The type of feature detected in a map is determined by the weight vector used. In a constrained feature map, the same weight vector is used for every unit of that map. In other words, a constrained feature map scans the pixel array for occurrences of the particular feature defined by one particular weight vector.

[0066] As applied to such maps, the term "constrained" means that the computational elements making up a particular map are forced to share the same set of weights. As a result, the same feature is detected at different locations in the input image. This technique is also known as weight sharing.

[0067] Those skilled in the art will understand that the weight vector defines a receptive field (for example, 5 pixels by 5 pixels, or 2 pixels by 2 pixels) on the plane of image pixels or map units being examined for occurrences of the feature defined by that weight vector.

[0068] By placing the weight vector on a pixel array, it is possible to show which pixels are input to a computational element in the feature map and which unit of the feature map is activated. The activated unit generally corresponds to the approximate location of the feature occurring in the map being examined.
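Weight sharing as described in paragraphs [0065] to [0068] can be sketched as one weight vector scanned across the image. The step of one pixel between receptive fields, the tanh nonlinearity, the 2 x 2 kernel, and the function name are assumptions; the text only states that every unit of a constrained map uses the same weights over a small receptive field.

```python
import math

def constrained_feature_map(image, kernel, bias_weight, nonlinearity=math.tanh):
    """Apply one shared weight vector (kernel plus bias weight) at every position."""
    k = len(kernel)                           # receptive field is k x k
    rows, cols = len(image), len(image[0])
    feature_map = []
    for r in range(rows - k + 1):             # assumed step of one pixel
        map_row = []
        for c in range(cols - k + 1):
            alpha = bias_weight               # bias input fixed at 1
            for dr in range(k):
                for dc in range(k):
                    alpha += kernel[dr][dc] * image[r + dr][c + dc]
            map_row.append(nonlinearity(alpha))
        feature_map.append(map_row)
    return feature_map

# A dark-to-bright vertical edge, and a 2 x 2 kernel that responds to it.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge_kernel = [[-1, 1],
               [-1, 1]]
fmap = constrained_feature_map(image, edge_kernel, bias_weight=0.0)
```

Units in the middle column of `fmap` respond strongly in every row, which is the "same feature detected at different locations" behaviour; the position of the active unit marks where the edge lies.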

[0069] The first feature detection stage comprises a number of constrained feature maps at level 20 and a corresponding number of feature reduction maps at level 30. As shown in FIG. 8, the network has four constrained feature maps 201 through 204 and an equal number of feature reduction maps 301 through 304 in the first feature detection stage.

[0070] The second feature detection stage comprises a number of constrained feature maps at level 40 and a corresponding number of feature reduction maps at level 50. As shown in the figure, the network has twelve constrained feature maps 401 through 412 and an equal number of feature reduction maps 501 through 512 in the second feature detection stage.

[0071] The final stage of the network is feature classification stage 60, which is fully connected to all of the feature reduction maps 501 through 512 of the second feature detection stage. Feature classification stage 60 generates an indication of the character recognized by the network from the supplied original image.

[0072] The term "fully connected" means that the computational element associated with a pixel in feature classification stage 60 receives its input from every pixel or unit in the preceding level of maps, namely level 50.

[0073] The interconnection lines from stage to stage in the network are drawn to show which maps in a preceding stage supply inputs to each and every computational element whose units form the maps in the stage of interest.

[0074] For example, constrained feature maps 201 through 204 detect different features from image 10 in the process of generating those constrained feature maps.

[0075] Proceeding to the next level of maps, feature reduction map 301 derives its input solely from the units in constrained feature map 201. Similarly, feature reduction maps 302-304 derive their inputs solely from the units in constrained feature maps 202-204, respectively.

[0076] In the network shown in FIG. 8, the interconnection from the first feature detection stage to the second feature detection stage is somewhat more complicated. Constrained feature maps 401, 404, 407, and 410 derive their inputs solely from the units in feature reduction maps 301-304, respectively.

[0077] Constrained feature maps 402, 403, 405, and 406 derive their inputs from combinations of units from feature reduction maps 301 and 302. Constrained feature maps 408, 409, 411, and 412 derive their inputs from combinations of units from feature reduction maps 303 and 304.
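The interconnections described above between the first and second feature detection stages can be summarized as a small lookup table. The following Python sketch is illustrative only: the names `SOURCES` and `sources_of` are hypothetical, and only the map reference numerals are taken from the description.

```python
# Which feature reduction maps (301-304) feed each constrained
# feature map (401-412), as stated in the description of FIG. 8.
SOURCES = {
    # Maps fed by a single feature reduction map.
    401: (301,), 404: (302,), 407: (303,), 410: (304,),
    # Maps fed by combinations of units from maps 301 and 302.
    402: (301, 302), 403: (301, 302), 405: (301, 302), 406: (301, 302),
    # Maps fed by combinations of units from maps 303 and 304.
    408: (303, 304), 409: (303, 304), 411: (303, 304), 412: (303, 304),
}


def sources_of(feature_map):
    """Return the feature reduction maps that supply inputs to feature_map."""
    return SOURCES[feature_map]


if __name__ == "__main__":
    for feature_map in sorted(SOURCES):
        print(feature_map, "<-", sources_of(feature_map))
```

The table records only which maps are connected; which units within a source map are combined is not specified in this passage and is therefore left out.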

[0078] Finally, the individual feature reduction maps 501-512 derive their inputs solely from the units in the corresponding individual constrained feature maps 401-412, respectively. Note that the feature classification stage 60 contains a sufficient number of elements for the particular character recognition problem to be solved by the network.

[0079] That is, to recognize either uppercase or lowercase Latin alphabet characters, the feature classification stage 60 would include twenty-six units representing the letters A through Z or a through z. On the other hand, to recognize numerals, the feature classification stage 60 would include only ten units representing the digits 0 through 9.
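As a rough illustration of a fully connected classification stage sized to the recognition problem, the following Python sketch gives every output unit a weight for every unit of every preceding map. It is a sketch under stated assumptions, not the implementation described here: the 4x4 map size, the random weights, and the names `classify` and `make_weights` are all hypothetical.

```python
import math
import random


def make_weights(n_classes, n_inputs, seed=0):
    """Hypothetical random weights: one weight per (output unit, input unit)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
            for _ in range(n_classes)]


def classify(maps, weights):
    """Fully connected stage: each output unit sees every unit of every map."""
    # Flatten all preceding maps into a single input vector.
    inputs = [unit for m in maps for row in m for unit in row]
    # One inner product followed by a squashing function per output unit.
    return [math.tanh(sum(w * x for w, x in zip(ws, inputs)))
            for ws in weights]


if __name__ == "__main__":
    # Twelve maps stand in for 501-512; the 4x4 size is a placeholder.
    maps = [[[0.5] * 4 for _ in range(4)] for _ in range(12)]
    digit_outputs = classify(maps, make_weights(10, 12 * 4 * 4))
    letter_outputs = classify(maps, make_weights(26, 12 * 4 * 4))
    print(len(digit_outputs), len(letter_outputs))
```

Only the number of weight rows changes between the digit and letter cases; the preceding stages are untouched.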

[0080] For convenience of explanation, the bias input to the computing elements shown in FIGS. 1 and 2, and the associated weight in the weight vector, are omitted here from the description of the neural network. In the experimental embodiment, the bias was set to 1 and its corresponding weight was learned by backpropagation.
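One common way to realize a bias fixed at 1 with a learned weight is to treat it as one extra input to the inner product. The Python sketch below assumes that arrangement; the function name `inner_product_with_bias` is hypothetical and the weight values in the usage example are arbitrary.

```python
def inner_product_with_bias(inputs, weights):
    """Weighted sum in which the last weight multiplies a bias input of 1."""
    if len(weights) != len(inputs) + 1:
        raise ValueError("expected one weight per input plus one bias weight")
    augmented = list(inputs) + [1.0]  # bias input fixed at 1
    return sum(w * x for w, x in zip(weights, augmented))


if __name__ == "__main__":
    # Two data inputs, so three weights: the third is the bias weight.
    print(inner_product_with_bias([2.0, 3.0], [0.5, -1.0, 0.25]))
```

Because the bias input is constant, only its weight has to be learned, exactly like any other weight in the vector.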

[0081] When the computing element of this embodiment was used in the neural network shown in FIG. 8 and programmed on an AT&T DSP32C digital signal processor, the neural network achieved a 100% increase in the operating speed of character recognition.

[0082] The neural network was trained by backpropagation using a hyperbolic tangent function. During the classification (character recognition) phase, the hyperbolic tangent function was replaced by a piecewise linear function. This piecewise linear function is shown in FIG. 6 and is expressed by Equation 4 below, where α is the input to the nonlinear function element in the computing element.
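The substitution described above can be sketched in Python as a computing element that keeps its learned weights and changes only the function applied to the inner product. This is a minimal sketch under stated assumptions: the three-segment clipping function below is a hypothetical stand-in and is not Equation 4, and the class and attribute names are illustrative.

```python
import math


def piecewise_linear(a):
    """Hypothetical three-segment stand-in for tanh, NOT the patent's Equation 4."""
    return max(-1.0, min(1.0, a))


class ComputingElement:
    """Inner product followed by a nonlinear function selected by mode."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.recognition_mode = False  # start in learning mode

    def output(self, inputs):
        a = sum(w * x for w, x in zip(self.weights, inputs))
        # Learning mode uses tanh; recognition mode uses the cheaper function.
        squash = piecewise_linear if self.recognition_mode else math.tanh
        return squash(a)


if __name__ == "__main__":
    element = ComputingElement([0.5, -0.25])
    print(element.output([4.0, 0.0]))   # learning mode: tanh of the inner product
    element.recognition_mode = True
    print(element.output([4.0, 0.0]))   # recognition mode: clipped inner product
```

The weights are never retrained after the switch; the cheaper function is simply evaluated in place of the hyperbolic tangent.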

【００８３】[0083]

【数４】 [Equation 4]

[Effects of the Invention] As described above, the present invention provides a neural network that achieves faster computation, without changing the processor, while maintaining the accuracy and reliability of the computation.

【図面の簡単な説明】[Brief description of drawings]

FIG. 1 is a block diagram showing a computing element of a learning network, that is, a neural network, according to an embodiment of the present invention.

FIG. 2 is a simplified block diagram of a computing element that can be used interchangeably with the computing element of the learning network, that is, the neural network, shown in FIG. 1.

FIG. 3 is a diagram showing one typical nonlinear function used by the computing elements of FIGS. 1 and 2.

FIG. 4 is a diagram showing another typical nonlinear function used by the computing elements of FIGS. 1 and 2.

FIG. 5 is a diagram showing yet another typical nonlinear function used by the computing elements of FIGS. 1 and 2.

FIG. 6 is a diagram showing yet another typical nonlinear function used by the computing elements of FIGS. 1 and 2.

FIG. 7 is a diagram showing yet another typical nonlinear function used by the computing elements of FIGS. 1 and 2.

FIG. 8 is a block diagram showing one embodiment of a typical hierarchical automatic learning network to which the present invention is applied.

【符号の説明】[Explanation of symbols]

1-1 to 1-(n+1): Multipliers
2: Adder
3: First nonlinear function element
4: Second nonlinear function element
5: Switch
6: Switch
7: Single nonlinear function element
10: Image
20: Constrained feature map level of the first feature detection stage
201 to 204: Constrained feature maps
30: Feature reduction map level of the first feature detection stage
301 to 304: Feature reduction maps
40: Constrained feature map level of the second feature detection stage
401 to 412: Constrained feature maps
50: Feature reduction map level of the second feature detection stage
501 to 512: Feature reduction maps
60: Feature classification stage

Claims

【特許請求の範囲】[Claims]

1. An arithmetic unit for a neural network having a learning mode and a recognition mode as operating modes, the arithmetic unit comprising: inner product computing means, responsive to a data input vector and to a weight vector determined by a first nonlinear function during the learning mode, for computing the inner product of those vectors; flattening means, responsive to the output of the inner product computing means, for flattening that output according to a predetermined nonlinear function, which is the first nonlinear function during the learning mode, in order to generate an output value of the arithmetic unit; and switching means, responsive to the neural network entering the recognition mode, for using, as the predetermined nonlinear function in the flattening means, a second nonlinear function requiring less computation than the first nonlinear function in place of the first nonlinear function.

2. The arithmetic unit of claim 1, wherein the second nonlinear function is a piecewise linear approximation of the first nonlinear function.

3. The arithmetic unit of claim 2, wherein the first nonlinear function is a hyperbolic tangent function.

4. The arithmetic unit of claim 3, wherein the second nonlinear function is a piecewise linear function defined by Equation 1 below, where x is the output of the inner product computing means. [Equation 1]

5. A neural network comprising a plurality of arithmetic units interconnected in a predetermined hierarchical structure and having a learning mode and a recognition mode as operating modes, wherein each arithmetic unit comprises: inner product computing means, responsive to a data input vector and to a weight vector determined by a first nonlinear function during the learning mode, for computing the inner product of those vectors; flattening means, responsive to the output of the inner product computing means, for flattening that output according to a predetermined nonlinear function, which is the first nonlinear function during the learning mode, in order to generate an output value of the arithmetic unit; and switching means, responsive to the neural network entering the recognition mode, for using, as the predetermined nonlinear function in the flattening means, a second nonlinear function requiring less computation than the first nonlinear function in place of the first nonlinear function.

6. The neural network of claim 5, wherein the second nonlinear function is a piecewise linear approximation of the first nonlinear function.

7. The neural network of claim 6, wherein the first nonlinear function is a hyperbolic tangent function.

8. The neural network of claim 7, wherein the second nonlinear function is a piecewise linear function defined by Equation 1 below, where x is the output of the inner product computing means. [Equation 1]

9. A method of controlling a neural network that has a learning mode and a recognition mode as operating modes and that comprises a plurality of arithmetic units, each arithmetic unit having flattening means which, responsive to data input values and to a set of weights determined by a first nonlinear function during the learning mode, flattens the data input values according to a predetermined nonlinear function, which is the first nonlinear function during the learning mode, in order to generate an output value, the method comprising the steps of: training the neural network according to a predetermined learning algorithm using the first nonlinear function in the arithmetic units; and, when the neural network operates in the recognition mode, using, as the predetermined nonlinear function in the flattening means, a second nonlinear function requiring less computation than the first nonlinear function in place of the first nonlinear function.

10. The method of claim 9, wherein the second nonlinear function is a piecewise linear approximation of the first nonlinear function.

11. The method of claim 10, wherein the first nonlinear function is a hyperbolic tangent function.

12. The method of claim 11, wherein the second nonlinear function is a piecewise linear function defined by Equation 1 below, where x is the input to the flattening means. [Equation 1]

13. The method of claim 9, wherein the first nonlinear function is a differentiable function.

14. The method of claim 13, wherein the predetermined learning algorithm is a backpropagation learning algorithm.

15. The method of claim 14, wherein the second nonlinear function is a piecewise linear approximation of the first nonlinear function.

16. The method of claim 15, wherein the first nonlinear function is a hyperbolic tangent function.

17. The method of claim 16, wherein the second nonlinear function is a piecewise linear function defined by Equation 1 below, where x is the input to the flattening means. [Equation 1]