KR0141341B1 - Error signal generation method for efficient learning of multilayer perceptron neural network

Info

Publication number: KR0141341B1
Application number: KR1019940025170A
Authority: KR
Inventors: 오상훈
Original assignee: 양승택; 재단법인한국전자통신연구원
Priority date: 1994-09-30
Filing date: 1994-09-30
Publication date: 1998-07-01
Also published as: JP2607351B2; JPH08115310A; KR960012131A

Abstract

To improve the performance of the backpropagation learning algorithm for multilayer perceptrons, a new error function is proposed.

The proposed error function generates a larger error signal the more the output value of an output-layer node differs from its target value, which reduces the inappropriate saturation of output nodes during learning.

In addition, when the output value is close to the target value, only a small error signal is generated, which keeps the neural network from overlearning the training patterns.

Description

Method for Generating an Error Signal for Efficient Learning of Multilayer Perceptron Neural Networks

FIG. 1 is a structural diagram of a multilayer perceptron neural network to which the present invention is applied;

FIG. 2 is a characteristic diagram of the sigmoid activation function according to the present invention;

FIG. 3 is a flowchart of the general backpropagation learning method of a multilayer perceptron according to the present invention;

FIG. 4 is a characteristic diagram of the error signal proposed for efficient learning according to the present invention.

* Description of reference numerals for the main parts of the drawings

1: forward computation  2: output error signal computation

3: backpropagation of the error signal  4: learning by weight update

The present invention relates to an efficient learning method for the multilayer perceptron neural network model, which is widely used for learning pattern recognition problems.

Conventionally, training a multilayer perceptron often takes a long time, or some patterns are never learned completely.

Accordingly, an object of the present invention is to solve these problems, thereby shortening the learning time for pattern recognition problems that use a multilayer perceptron and allowing the training patterns to be learned properly.

The method of generating an error signal for efficient learning of a multilayer perceptron neural network according to the present invention generates an error signal from a multilayer perceptron, a neural network model imitating the information processing of living organisms in which nodes representing nerve cells and the connection weights linking those nodes are arranged hierarchically. The method is characterized in that, during backpropagation learning of the multilayer perceptron, a strong error signal is generated when an output node is inappropriately saturated, and a weak error signal is generated when the output node is appropriately saturated.

According to this method, because a strong error signal is generated when an output node is inappropriately saturated during backpropagation learning and a weak error signal is generated when the output node is appropriately saturated, inappropriate saturation of output nodes is reduced and the neural network is prevented from overlearning the training patterns.

The following terms are defined to describe the present invention.

First, a "multilayer perceptron" is a neural network model imitating the information processing of living organisms. As shown in FIG. 1, it consists of hierarchically arranged neuron nodes, which represent nerve cells, and synapse weights, which connect the nodes.

Each node of the multilayer perceptron takes as input the weighted sum of the states of the nodes in the layer below, formed with the corresponding connection weights, and outputs the sigmoid transform of that sum, as shown in FIG. 2.

The sigmoid function is divided into saturation regions on both sides, where the slope is small, and an active region in the center, where the slope is large.
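The two regions can be illustrated with a minimal sketch. It assumes the bipolar sigmoid f(a) = 2/(1 + e^(-a)) - 1 = tanh(a/2), whose output lies between -1 and +1; the text here does not reproduce the exact function, so this form and the names `sigmoid` and `sigmoid_slope` are assumptions for illustration only.

```python
import math

def sigmoid(a: float) -> float:
    """Bipolar sigmoid (assumed form): maps any real input into (-1, +1)."""
    return math.tanh(a / 2.0)

def sigmoid_slope(a: float) -> float:
    """Slope of the assumed sigmoid: f'(a) = (1 - f(a)^2) / 2."""
    x = sigmoid(a)
    return (1.0 - x * x) / 2.0

# The slope peaks in the central active region and shrinks toward
# zero in the saturation regions on either side.
for a in (-8.0, -2.0, 0.0, 2.0, 8.0):
    print(f"a={a:+.1f}  f(a)={sigmoid(a):+.4f}  slope={sigmoid_slope(a):.4f}")
```

Under this assumed form the slope is largest at a = 0 and falls off symmetrically, which is the behavior the definition of the saturation and active regions relies on.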

Training patterns are patterns collected arbitrarily in order to train on a pattern recognition problem.

Test patterns are patterns collected arbitrarily to serve as the basis for testing how well the pattern recognition problem has been learned.

These patterns can be divided into several classes, and pattern recognition is the task of deciding which class an input pattern belongs to.

The states of the final-layer nodes indicate the class to which the input pattern belongs.

Backpropagation learning is a method of training the multilayer perceptron: after a training pattern is input, the weights connected to the final-layer nodes are changed according to the error signal so that the output values of the final-layer nodes approach the desired target values, and the nodes of each lower layer change their connection weights according to the error signal backpropagated from the layer above.

The error function is the function that determines how the error signal is generated in backpropagation learning.

Saturation of a node means that the node's weighted-sum input lies in a region where the slope of the sigmoid function is small.

If a node lies in the same saturation region as its target value, this is called appropriate saturation; if it lies in the opposite saturation region, it is called inappropriate saturation.
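The distinction between appropriate and inappropriate saturation can be sketched as a small classifier. The function name `saturation_kind` and the cutoff `threshold` are hypothetical: the text gives no numeric boundary for the saturation regions, and targets are assumed to be -1 or +1.

```python
def saturation_kind(x: float, target: float, threshold: float = 0.9) -> str:
    """Classify an output state x in (-1, +1) against a target of -1 or +1.

    threshold is a hypothetical cutoff for "saturated"; it is not
    taken from the source text.
    """
    if abs(x) < threshold:
        return "active"          # node is still in the high-slope region
    # Saturated: same sign as the target is appropriate, opposite is not.
    return "appropriate" if x * target > 0 else "inappropriate"

print(saturation_kind(0.95, 1.0))
print(saturation_kind(-0.95, 1.0))
print(saturation_kind(0.2, -1.0))
```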

The details of the backpropagation learning algorithm for the multilayer perceptron are shown in FIG. 3.

When a training pattern $x = [x_1, x_2, \ldots, x_{N_0}]$ is input, the state of the j-th node in layer $l$ of a multilayer perceptron consisting of $L$ layers is determined by the forward computation (1) as

$$x_j^{(l)} = f\left(a_j^{(l)}\right),$$

where

$$a_j^{(l)} = \sum_{i} w_{ji}^{(l)} x_i^{(l-1)} + w_{j0}^{(l)},$$

$f$ is the sigmoid activation function, $w_{ji}^{(l)}$ is the connection weight between $x_i^{(l-1)}$ and $x_j^{(l)}$, and $w_{j0}^{(l)}$ is the bias of $x_j^{(l)}$.

Once the states $x_k^{(L)}$ of the final-layer nodes have been obtained in this way, the error function of the multilayer perceptron is defined, in terms of the target pattern $t = [t_1, t_2, \ldots, t_{N_L}]$ for the input pattern, as

$$E_m = \frac{1}{2} \sum_{k} \left(t_k - x_k^{(L)}\right)^2.$$

Error signals are generated so as to reduce the value of this error function, and each weight is changed according to those error signals.

That is, the error signal (2) of the output layer is computed as

$$\delta_k^{(L)} = -\frac{\partial E_m}{\partial a_k^{(L)}} = \left(t_k - x_k^{(L)}\right) f'\left(a_k^{(L)}\right).$$

The error signal of a lower layer is computed by backpropagation of the error signal (3) as

$$\delta_j^{(l-1)} = f'\left(a_j^{(l-1)}\right) \sum_{k} w_{kj}^{(l)} \delta_k^{(l)}.$$

The weights of each layer are then changed (4) according to

$$\Delta w_{ji}^{(l)} = \eta\, \delta_j^{(l)} x_i^{(l-1)},$$

where $\eta$ is the learning rate, which completes learning for one training pattern.
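Steps (1) to (4) for a single training pattern can be sketched as follows. This is an illustration under stated assumptions rather than the patent's own procedure: the sigmoid is assumed to be tanh(a/2) with slope (1 - x^2)/2, the weight layout (`weights[l][j] = [bias, w_j1, w_j2, ...]`), the learning rate name `eta`, and the function names are all hypothetical.

```python
import math

def f(a):
    """Assumed bipolar sigmoid, f(a) = tanh(a / 2), output in (-1, +1)."""
    return math.tanh(a / 2.0)

def slope(x):
    """Slope of the assumed sigmoid, written in terms of its output x."""
    return (1.0 - x * x) / 2.0

def forward(weights, x):
    """Step (1): return the states of every layer, input layer first."""
    states = [list(x)]
    for layer in weights:
        prev = states[-1]
        states.append([f(row[0] + sum(w * s for w, s in zip(row[1:], prev)))
                       for row in layer])
    return states

def train_pattern(weights, x, t, eta=0.1):
    """Steps (1)-(4) for one training pattern; updates weights in place."""
    states = forward(weights, x)
    # Step (2): output error signal = (target - output) * slope.
    delta = [(tk - xk) * slope(xk) for tk, xk in zip(t, states[-1])]
    for l in range(len(weights) - 1, -1, -1):
        below = states[l]
        delta_below = []
        if l > 0:
            # Step (3): backpropagate using the weights before they change.
            delta_below = [
                slope(below[i]) * sum(weights[l][k][i + 1] * delta[k]
                                      for k in range(len(delta)))
                for i in range(len(below))
            ]
        # Step (4): change bias and weights in proportion to the error signal.
        for j, row in enumerate(weights[l]):
            row[0] += eta * delta[j]
            for i, s in enumerate(below):
                row[i + 1] += eta * delta[j] * s
        delta = delta_below
    return states[-1]

# Hypothetical 2-2-1 network and a single pattern with target +1.
weights = [
    [[0.1, 0.2, -0.3], [-0.1, 0.4, 0.1]],  # hidden layer: 2 nodes, 2 inputs
    [[0.0, 0.3, -0.2]],                    # output layer: 1 node
]
x, t = [1.0, -1.0], [1.0]
before = forward(weights, x)[-1][0]
for _ in range(200):
    train_pattern(weights, x, t)
after = forward(weights, x)[-1][0]
print(f"output before: {before:.4f}, after: {after:.4f}")
```

Repeating `train_pattern` on the same pattern moves the output toward its target, which is the per-pattern learning step the text describes.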

One pass of this procedure over all training patterns is called a sweep.

In the backpropagation algorithm described above, the error signal $\delta_k^{(L)}$ is the difference between the target value and the actual output value multiplied by the slope of the sigmoid activation function.

If $x_k^{(L)}$ is close to $-1$ or $+1$, $\delta_k^{(L)}$ becomes very small because of the slope term.

That is, when $t_k = 1$ and $x_k^{(L)} \approx -1$, or vice versa, $x_k^{(L)}$ cannot generate an error signal strong enough to adjust the weights connected to it.

Such inappropriate saturation of output nodes delays the minimization of $E_m$ in backpropagation learning and prevents some patterns from being learned.
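A short numeric check makes the problem concrete. The sketch assumes the bipolar sigmoid with slope (1 - x^2)/2, so the conventional error signal is (t - x)(1 - x^2)/2; the function name is hypothetical.

```python
def conventional_error_signal(t: float, x: float) -> float:
    """(target - output) times the slope (1 - x^2) / 2 of the assumed sigmoid."""
    return (t - x) * (1.0 - x * x) / 2.0

# Target is +1. The output is furthest from the target at x = -0.99,
# yet the error signal there is far smaller than at x = 0.
for x in (-0.99, -0.9, 0.0, 0.9):
    print(f"x={x:+.2f}  delta={conventional_error_signal(1.0, x):.4f}")
```

Under these assumptions the signal at x = -0.99 is roughly 0.02, compared with 0.5 at x = 0, even though the output at -0.99 is almost as wrong as it can be.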

The present invention changes the error function used for learning to

[error function equation not reproduced in the source text]

so that, with this error function, the error signal of an output node becomes

[error signal equation not reproduced in the source text]
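One form consistent with this description and with the wording of the claim (an error signal that is a quadratic function of the difference between target and output) is sketched here. It is a reconstruction, not the patent's own equation: it assumes targets $t_k = \pm 1$ and the bipolar sigmoid with slope $(1 - x^2)/2$, and the factor $1/2$ is an assumed normalization.

$$\delta_k^{(L)} = \frac{t_k \left(t_k - x_k^{(L)}\right)^2}{2}$$

Squaring the difference discards its sign, so the factor $t_k$ restores the direction of the correction. Since $\delta_k^{(L)} = -\partial E / \partial a_k^{(L)}$ and $\partial x_k^{(L)} / \partial a_k^{(L)} = \left(1 - (x_k^{(L)})^2\right)/2$ under the assumed sigmoid, an error function that yields this signal would be

$$E = -\sum_{k} \int \frac{t_k \left(t_k - x_k^{(L)}\right)^2}{1 - \left(x_k^{(L)}\right)^2} \, dx_k^{(L)}.$$

With $t_k = 1$ this gives $\delta_k^{(L)} = 2$ at $x_k^{(L)} = -1$ (inappropriate saturation) and $\delta_k^{(L)} = 0$ at $x_k^{(L)} = +1$ (appropriate saturation), matching the strong and weak signals described.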

The other equations used for learning are the same as in the backpropagation algorithm that uses $E_m$.

In the backpropagation algorithm using the proposed error function, an inappropriately saturated output node generates a strong error signal, whereas an output node saturated in the same direction as its target value generates a weak error signal. This reduces inappropriate saturation of output nodes and at the same time prevents overlearning of the training patterns.

FIG. 4 compares the error signals as a function of $x_k^{(L)}$ for the case $t_k = 1$: the curve labeled CE is the error signal obtained with the conventional error function, and the curve labeled PE is the error signal obtained with the error function proposed in the present invention.
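The kind of comparison FIG. 4 describes can be tabulated with a short sketch. Both formulas are assumptions: CE uses (t - x)(1 - x^2)/2, which presumes the bipolar sigmoid tanh(a/2), and PE uses t(t - x)^2/2, a quadratic form consistent with the claim's wording but not confirmed by the text. The function names are hypothetical.

```python
def ce_signal(t: float, x: float) -> float:
    """Conventional error signal (assumed): difference times sigmoid slope."""
    return (t - x) * (1.0 - x * x) / 2.0

def pe_signal(t: float, x: float) -> float:
    """Proposed error signal (assumed quadratic form, target t = +1 or -1)."""
    return t * (t - x) ** 2 / 2.0

t = 1.0
print("    x       CE       PE")
for i in range(-10, 11, 2):
    x = i / 10.0
    print(f"{x:+.1f}  {ce_signal(t, x):7.4f}  {pe_signal(t, x):7.4f}")
```

Under these assumptions and for t = 1, CE equals PE multiplied by (1 + x), so PE is the larger signal for x < 0 (output on the wrong side) and the smaller one for x > 0 (output approaching the target).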

In addition, the other equations used for learning are the same as in the conventional backpropagation algorithm using the error function $E_m$.

According to the method of generating an error signal for efficient learning of a multilayer perceptron neural network of the present invention, the backpropagation algorithm using the proposed error function generates a strong error signal when the output value of an output-layer node differs greatly from its target value, which reduces inappropriate saturation of output nodes, and generates a weak error signal when the output value is close to the target value, which prevents the neural network from overlearning the training patterns. As a result, learning time is shortened and the training patterns can be learned properly.

Claims

A method of generating an error signal for efficient learning of a multilayer perceptron neural network, in a learning method for a multilayer perceptron, which is a neural network model imitating the information processing of living organisms and consists of nodes representing neurons and synapse weights connecting those nodes, wherein the weights are changed according to an error signal given by the partial derivative of an error function in order to reduce the value of the error function computed from the target values and the actual output values of the output-layer nodes of the multilayer perceptron, characterized in that, so that an output-layer node generates a stronger error signal the more inappropriately it is saturated and a weaker error signal the more appropriately it is saturated, the error signal $\delta_k^{(L)}$ of the output-layer node is made a quadratic function of the difference between the target value and the output value, expressed by the following equation:

[equation not reproduced in the source text]

where $t_k$ is the target value and $x_k^{(L)}$ is the output value.