KR100338473B1 - Face detection method using multi-dimensional neural network and device for the same

Info

Publication number: KR100338473B1
Application number: KR1019990026661A
Authority: KR
Inventors: 이필규; 라채우
Original assignee: 조양호; 학교법인 인하학원
Priority date: 1999-07-02
Filing date: 1999-07-02
Publication date: 2002-05-30
Also published as: KR20010008715A

Abstract

The present invention relates to a method for detecting a human face in a digital image. It comprises a part that reduces the size of a digital image input from a camera, a part that preprocesses the reduced image, a sub-image sampling part that samples the preprocessed image into sub-images (sub-spaces) using N search windows, and a decision part that uses a multi-dimensional neural network to judge whether each sampled sub-image is a face. The invention reduces the false face decisions caused by loss of the pixels' original features, a drawback of conventional face detection methods, and is effective at detecting faces of various sizes within an image.

Description

Face detection method using multi-dimensional neural network and device for the same

The present invention relates to a method for detecting a human face in a digital image. In particular, it relates to a face detection method using a multi-dimensional neural network that detects faces of various sizes in an image without losing the original features of the pixels, a loss that conventional face detection methods suffer because of repeated image reduction. To that end, the digital image input from a camera is reduced in size, and the reduced image then goes through preprocessing and face sample extraction to judge whether a face is present.

In general, driven by advances in the electronics industry and in computers and peripherals, public institutions such as research institutes and banks are increasingly adopting electronic surveillance, a 24-hour monitoring system that can be run by only a few security guards.

Among such electronic surveillance systems, the widely known CCTV method is heavily used in banks, on highways, and in similar settings. In a bank, the CCTV films and records the faces of people active within its imaging area while a guard monitors the footage, so that incidents can be prevented; when an incident is not prevented, the recording is searched to examine its cause or to help apprehend the offender. On highways, the recorded material is used to check for speeding vehicles.

When the recorded material is examined to identify an offender's face, the most commonly used approaches fall into two groups: correlation calculation using a face template, and methods using a neural network.

Looking at these face detection approaches in more detail, the template-based correlation method computes the correlation between a face template, arbitrarily extracted in advance from an image, and every pixel position in the image, and judges the most similar region to be a face.

This correlation method has the advantage of low installation cost, but it cannot detect faces of varying size and shape, and when there are several face templates the correlation between every template and the image must be computed, which takes a great deal of computation time.
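The template correlation approach described above can be illustrated with a minimal sketch. It assumes grayscale images stored as nested lists and uses normalized cross-correlation as the similarity measure; the function names `ncc` and `best_match` are hypothetical, and the text does not say which correlation measure conventional systems used.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation between two equal-sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0

def best_match(image, template):
    """Slide the template over every position; return (score, row, col) of the best one."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[0]:
                best = (score, r, c)
    return best

image = [
    [0, 0, 0, 0, 0],
    [0, 9, 1, 0, 0],
    [0, 1, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[9, 1], [1, 9]]
score, row, col = best_match(image, template)
```

Because the template has a fixed size, a face that appears larger or smaller than the template would not score well, and every additional template repeats the full scan, which matches the two drawbacks noted above.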

The neural network approach, the other representative method, trains a neural network on face templates and then detects faces. Because a neural network can identify face regions that are reasonably similar to the faces it was trained on, it has been widely used in recent years.

This approach can be divided into the following two methods.

The first method forms N sub-images from the input image, samples images of a fixed size from the N sub-images, and feeds the sampled images to a neural network. Here the N sub-images are formed by reducing the image to various arbitrary sizes.

However, with this method the face position may be misjudged because the features carried by the original pixels are lost when the image is reduced. For a human face these features consist of the eyes, nose, mouth, and so on, and they are important information for judging that a region is a human face. Losing them therefore means losing the most important information for the face decision.

The second method is similar to the first, but forms the N sub-images from mosaics of various sizes rather than by image reduction. Mosaicking has the advantage of normalizing the image fed to the neural network, but because it blurs the image, it can likewise cause the face position to be misjudged through loss of the original pixels' features.
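The feature loss attributed to mosaicking can be seen in a small sketch. It assumes a mosaic is formed by replacing each tile with its mean value, which is one common definition and not necessarily the one used by the prior methods; the name `mosaic` and the sample patch are hypothetical.

```python
def mosaic(image, block):
    """Replace each block x block tile with its mean value (a simple mosaic)."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r0 in range(0, h, block):
        for c0 in range(0, w, block):
            rows = range(r0, min(r0 + block, h))
            cols = range(c0, min(c0 + block, w))
            tile = [image[r][c] for r in rows for c in cols]
            mean = sum(tile) / len(tile)
            for r in rows:
                for c in cols:
                    out[r][c] = mean
    return out

# A 4x4 patch with two dark "eye" pixels on a bright background.
patch = [
    [200, 200, 200, 200],
    [200,  20, 200,  20],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
blurred = mosaic(patch, 2)
contrast_before = max(map(max, patch)) - min(map(min, patch))
contrast_after = max(map(max, blurred)) - min(map(min, blurred))
```

In this toy patch the dark pixels are averaged into their tiles, so the contrast that marked them drops from 180 to 45 gray levels.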

An object of the present invention, made to solve the above problems, is to provide a face detection method using a multi-dimensional neural network that detects faces of various sizes in an image without losing the original pixel features that conventional methods lose through repeated image reduction. The digital image input from a camera is reduced in size, and the reduced image then goes through preprocessing and face sample extraction to judge whether a face is present.

FIG. 1 is an exemplary diagram for explaining a feed-forward neural network.

FIG. 2 is an exemplary diagram of a system configuration applying the face detection method using a multi-dimensional neural network according to the present invention.

FIG. 3 is an exemplary diagram of the face detection operation using the multi-dimensional neural network.

<Explanation of reference numerals for the main parts of the drawings>

100: camera
200: face data acquisition unit
210: A/D converter
220: screen reduction unit
230: preprocessing unit
240a to 240c: image filters
250: multi-dimensional neural network
260: threshold range determination unit
300: face data storage unit

To achieve the above object, a feature of the present invention is a method for recognizing a specific object in an image signal acquired from a camera, comprising: a first step of reducing the image signal acquired from the camera at a predetermined ratio; a second step of dividing the image signal reduced in the first step into a predetermined number of sub-screen regions; a third step of sampling a specific region among the sub-screen regions divided in the second step and providing it as an input pattern to a neural network that has already been trained on the specific object; and a fourth step of comparing the recognition pattern of the neural network after the third step with a preset threshold and, if it is not smaller than the threshold, judging that a face has been recognized and storing the corresponding sub-screen region.

To achieve the above object, another feature of the present invention is a device for recognizing a specific object in an image signal acquired from a camera, comprising: a camera that captures and reads an image of a specific area; a screen reduction unit that reduces the image data acquired from the camera to a predetermined size and outputs it; a preprocessing unit that removes noise mixed into the image reduced by the screen reduction unit; a plurality of image filters, each of which receives the noise-free image signal from the preprocessing unit and passes only the image of a preset region; a multi-dimensional neural network that receives the plurality of sampled image data output from the image filters and, over P training patterns, repeatedly applies the process of computing the total output-layer error (Equation 3 below) by forward propagation (Equation 1 below), differentiating the output-layer error with respect to the value of each hidden-layer neuron (Equation 5 below), and changing the synaptic weights in the direction that reduces the error (Equation 4 below); a threshold range determination unit that sequentially receives the learning patterns and recognition patterns output from the multi-dimensional neural network and judges whether they fall within a preset threshold range; and a face data storage unit that stores the face data obtained by the threshold range determination unit.

The above object and the various advantages of the present invention will become clearer to those skilled in the art from the preferred embodiments described below with reference to the accompanying drawings.

First, before describing the present invention, the neural network procedure that it applies as a preliminary process is briefly reviewed.

A typical multilayer perceptron, as shown in the accompanying FIG. 1, is a layered neural network with one or more intermediate (hidden) layers between the input layer and the output layer; it has the form of several single-layer perceptrons connected in series.

Each input value applied to the input layer is multiplied by the synaptic weights attached to its input neuron, and the products are summed for each neuron of the adjacent hidden layer. The output of that neuron in turn becomes an input to the next hidden layer, and values propagate in this way, layer by layer, to the output layer. That is, the input to the j-th neuron of the l-th hidden layer is computed as in Equation 1 below.

Equation 1 includes a bias term for the neuron. Its weight term is the synaptic weight connecting the k-th neuron of the (l-1)-th hidden layer to the j-th neuron of the l-th hidden layer, and its output term is the output value of the k-th neuron of the (l-1)-th hidden layer. The variable N is the number of hidden neurons in the (l-1)-th hidden layer.

Accordingly, given the input to a hidden-layer neuron as above, the output of that neuron can be defined as in Equation 2 below.
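Equations 1 and 2 are not reproduced in this text. A conventional form consistent with the definitions above might read as follows; the symbol names ($net$, $w$, $o$, $\theta$, $f$) and the choice of a sigmoid for $f$ are assumptions rather than the patent's own notation.

```latex
\begin{align}
  net_j^{l} &= \sum_{k=1}^{N} w_{jk}^{l}\, o_k^{l-1} + \theta_j^{l} \tag{1}\\
  o_j^{l} &= f\!\left(net_j^{l}\right), \qquad f(x) = \frac{1}{1 + e^{-x}} \tag{2}
\end{align}
```

Here $net_j^{l}$ stands for the input to the j-th neuron of the l-th hidden layer, $w_{jk}^{l}$ for the synaptic weight from the k-th neuron of layer $l-1$, $o_k^{l-1}$ for that neuron's output, $\theta_j^{l}$ for the bias, and $N$ for the number of neurons in layer $l-1$.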

For a multilayer perceptron with this structure to work correctly as a recognizer, the synaptic weights connecting the neurons must first be adjusted to suitable values. This weight adjustment is the learning process of the multilayer perceptron, and it is computed layer by layer using error backpropagation.

The multilayer perceptron is trained by taking P training patterns as input, setting the desired output for each training pattern as the target of the output layer, and finding the synaptic weights that minimize the mean squared error (MSE) between the actual outputs and the targets of the output layer.

Therefore, the MSE for the P training patterns, their corresponding output vectors, and the target vectors is computed as in Equation 3 below.

To minimize the MSE of Equation 3, error backpropagation repeatedly updates the output-layer weights as in Equation 4 below.

In Equation 4, one variable is the learning rate and the other term is the derivative of the output-layer error with respect to each hidden-layer neuron value. Written out, this derivative is defined as in Equation 5 below.
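For reference, Equations 3 to 5 could be written in a conventional backpropagation notation as below. This is a sketch under assumptions: the symbols ($E$, $t$, $o$, $w$, $\eta$, $\delta$, $net$, $f$), the factor $1/2$, and the index $L$ for the output layer are not taken from the patent's own equations, which are not reproduced in this text.

```latex
\begin{align}
  E &= \frac{1}{P}\sum_{p=1}^{P} \frac{1}{2}\sum_{j}\left(t_{pj} - o_{pj}^{L}\right)^{2} \tag{3}\\
  w_{jk}^{l} &\leftarrow w_{jk}^{l} - \eta\,\delta_{j}^{l}\,o_{k}^{l-1} \tag{4}\\
  \delta_{j}^{l} &=
  \begin{cases}
    -\left(t_{j} - o_{j}^{L}\right) f'\!\left(net_{j}^{L}\right), & l = L\\[4pt]
    f'\!\left(net_{j}^{l}\right)\displaystyle\sum_{i} w_{ij}^{l+1}\,\delta_{i}^{l+1}, & l < L
  \end{cases} \tag{5}
\end{align}
```

In this notation $\eta$ is the learning rate, $t$ and $o^{L}$ are the target and actual outputs, and $\delta_{j}^{l}$ is the derivative of the per-pattern error with respect to the input $net_{j}^{l}$ of neuron $j$ in layer $l$.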

Summarizing conventional error backpropagation in terms of these equations: for a given input vector and target vector, the algorithm computes the total output-layer error of Equation 3 by forward propagation according to Equation 1; then, to minimize it, differentiates the output-layer error with respect to the value of each hidden-layer neuron as in Equation 5 and changes the synaptic weights through Equation 4 in the direction that reduces the error. This process is applied repeatedly over the P training patterns.
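The backpropagation loop summarized above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: it assumes one hidden layer, sigmoid activations, per-pattern updates, and a toy AND problem as the training set; all names (`forward`, `train`, `rate`, and so on) are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Forward propagation; the last weight in each row acts as the bias."""
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x + [1.0]))) for ws in w_hidden]
    out = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, out

def mse(patterns, w_hidden, w_out):
    """Mean squared error between outputs and targets over all patterns."""
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2 for x, t in patterns) / len(patterns)

def train(patterns, n_hidden=3, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    start_error = mse(patterns, w_hidden, w_out)
    for _ in range(epochs):
        for x, t in patterns:
            hidden, out = forward(x, w_hidden, w_out)
            # Output-layer delta: derivative of the squared error w.r.t. the net input.
            d_out = (out - t) * out * (1.0 - out)
            # Hidden-layer deltas, propagated back through the output weights.
            d_hid = [d_out * w_out[j] * hidden[j] * (1.0 - hidden[j]) for j in range(n_hidden)]
            # Move each weight against its gradient, scaled by the learning rate.
            for j, v in enumerate(hidden + [1.0]):
                w_out[j] -= rate * d_out * v
            for j in range(n_hidden):
                for k, v in enumerate(x + [1.0]):
                    w_hidden[j][k] -= rate * d_hid[j] * v
    return w_hidden, w_out, start_error, mse(patterns, w_hidden, w_out)

# Toy training set (logical AND), standing in for face / non-face patterns.
patterns = [([0.0, 0.0], 0.0), ([0.0, 1.0], 0.0), ([1.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
w_hidden, w_out, start_error, end_error = train(patterns)
```

Comparing `start_error` with `end_error` shows whether the repeated updates reduced the error on the training patterns, which is the behavior the summary describes.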

The face detection process of the present invention, which uses the multi-dimensional neural network together with the preliminary process above, is described below with reference to the accompanying FIGS. 2 and 3.

FIG. 2 shows an example system configuration applying the face detection method using a multi-dimensional neural network according to the present invention. It consists broadly of a camera (100) that captures and reads an image of a specific area; a face data acquisition unit (200) that detects, using the sub-screen technique and the multi-dimensional neural network, data judged to be a face in the image acquired from the camera (100); and a face data storage unit (300) that stores the face data acquired by the face data acquisition unit (200).

In detail, the face data acquisition unit (200) consists of an A/D converter (210) that converts the analog image acquired from the camera (100) into digital data for signal processing; a screen reduction unit (220) that reduces the digital data output from the A/D converter (210) to a predetermined size and outputs it; a preprocessing unit (230) that removes noise mixed into the image reduced by the screen reduction unit (220); a plurality of image filters (240a to 240c), each of which receives the noise-free image signal from the preprocessing unit (230) and passes only the image of a preset region; a multi-dimensional neural network (250) that learns and recognizes the plurality of sampled image data output from the image filters (240a to 240c) through the process of Equations 1 to 5; and a threshold range determination unit (260) that sequentially receives the learning patterns and recognition patterns output from the multi-dimensional neural network (250) and judges whether they fall within a preset threshold range.

Within the face data acquisition unit (200), the A/D converter (210) is used only when the camera (100) is an analog camera; it is not used when the camera (100) is a digital camera.

The threshold range determination unit (260) sequentially receives the learning patterns and recognition patterns output from the neural networks (250a to 250c). Only data whose recognition pattern, received from a given neural network, is judged to fall within the threshold range is stored in the face data storage unit (300); if the pattern is judged to fall outside the threshold range, a new recognition pattern is received from the next neural network.
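The sequential check performed by the threshold range determination unit can be pictured as follows. This is only a sketch: it assumes each network produces a single scalar output and that the threshold range is a closed interval `[low, high]`, neither of which the text states; `collect_in_range` is a hypothetical name.

```python
def collect_in_range(outputs, low, high):
    """Check each network's output in turn and keep those inside [low, high]."""
    stored = []
    for index, value in enumerate(outputs):
        if low <= value <= high:
            stored.append((index, value))  # would be passed on to the storage unit
        # otherwise simply move on to the next network's output
    return stored

# Outputs of three networks for their respective sub-images (made-up values).
stored = collect_in_range([0.2, 0.95, 0.7], low=0.9, high=1.0)
```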

A preferred example of operation of the system configured as above, which applies the face detection method using a multi-dimensional neural network according to the present invention, is described with reference to FIG. 3.

FIG. 3 shows the overall operation of each component of the system in FIG. 2. In step S101, the A/D converter (210) converts the image signal input through the camera (100) into a digital image and outputs it. The screen reduction unit (220), receiving this digital image data, reduces the image in step S102.
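The image reduction of step S102 could look like the sketch below. The text does not specify the reduction algorithm or ratio, so averaging non-overlapping blocks with an integer `factor` is an assumption, and `reduce_image` is a hypothetical name.

```python
def reduce_image(image, factor):
    """Shrink a grayscale image by averaging non-overlapping factor x factor blocks."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [[sum(image[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor)) / factor ** 2
             for c in range(w)]
            for r in range(h)]

image = [
    [0, 2, 4, 6],
    [2, 4, 6, 8],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
small = reduce_image(image, 2)
```

Here the 4x4 input becomes a 2x2 image whose pixels are the means of the four source blocks.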

In step S103, unnecessary noise is removed from the image reduced in step S102 and illumination compensation is applied. The output signal of the preprocessing unit (230), which performs step S103, is fed simultaneously to the plurality of filters (240a to 240c).
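A possible shape for the preprocessing of step S103 is sketched below. The text names noise removal and illumination compensation but not the algorithms, so the 3x3 mean filter and the linear intensity stretch used here are stand-in assumptions, and all function names are hypothetical.

```python
def mean_filter(image):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [image[rr][cc]
                    for rr in range(max(r - 1, 0), min(r + 2, h))
                    for cc in range(max(c - 1, 0), min(c + 2, w))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def stretch(image, lo=0.0, hi=255.0):
    """Linearly rescale intensities to [lo, hi] as a crude illumination compensation."""
    flat = [v for row in image for v in row]
    vmin, vmax = min(flat), max(flat)
    if vmax == vmin:
        return [[lo for _ in row] for row in image]
    scale = (hi - lo) / (vmax - vmin)
    return [[lo + (v - vmin) * scale for v in row] for row in image]

def preprocess(image):
    """Noise removal followed by illumination compensation."""
    return stretch(mean_filter(image))

# A dim image with one isolated bright pixel standing in for noise.
noisy = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 40],
]
result = preprocess(noisy)
```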

Each of the filters (240a to 240c) detects and outputs only the image of a specific region of the picture. As in steps S104a to S104c of FIG. 3, this is equivalent to sampling sub-images from the image using N search windows.

The sampled sub-image data output simultaneously from steps S104a to S104c are then input to the multi-dimensional neural network (250) of FIG. 2, which performs face detection through steps S105a to S105c shown in FIG. 3.

As for the definition of the multi-dimensional neural network, "multi-dimensional" refers to the dimension of the input patterns fed to the neural network, where dimension means the number of input patterns fed to a single neural network. The filters shown in FIG. 2 are the parts that sample these input patterns. The filter section samples the original image into sub-images (sub-spaces) using search windows. In other words, the scale of the search window corresponds to the dimension of the input patterns fed to the neural network.
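The sampling performed by the filters can be sketched as a set of search windows of different sizes sliding over the image. The window sizes, the one-pixel step, and the name `sample_windows` are assumptions for illustration; the text only says that N search windows of different scale are used.

```python
def sample_windows(image, sizes, step=1):
    """Yield (size, row, col, sub_image) for each window position, upper left to lower right."""
    h, w = len(image), len(image[0])
    for size in sizes:
        for r in range(0, h - size + 1, step):
            for c in range(0, w - size + 1, step):
                yield size, r, c, [row[c:c + size] for row in image[r:r + size]]

# A 4x4 test image whose pixel values encode their position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
windows = list(sample_windows(image, sizes=[2, 3]))
```

Each window size produces sub-images of one fixed dimension, so each size can feed its own network, which is how the search-window scale maps onto the input dimension described above.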

Accordingly, the multi-dimensional neural network (250) performs face detection on the sub-screen images output from the filters (240a to 240c), based on the patterns already learned through the process of Equations 1 to 5.

In step S106, the value of the recognition pattern output from steps S105a to S105c is compared with a preset threshold. If it is judged not to exceed the threshold, the region is judged not to be a face, and the process proceeds to step S107, where the output signal of the next neural network is received and checked again.

That is, the process is repeated over all sampled positions, from the upper left to the lower right of the image, until a face is recognized; when it ends, the positions of the regions judged to be faces are stored as the result.

This process is summarized in the following steps.

Step 1. Reduce the input digital image received from the camera.

Step 2. Preprocess the reduced image (noise removal, illumination compensation).

Step 3. Sample sub-images using N search windows. (The size of each of the N search windows equals the size of the sub-image input to its neural network and differs by dimension; the search runs from the upper left to the lower right of the reduced image.)

Step 4. Input each sampled sub-image to the neural network corresponding to that sub-image and output the result.

Step 5. If the output value of the neural network is equal to or greater than the threshold, judge the sub-image to be a face; if it is smaller, judge it not to be a face.

Step 6. Repeat steps 3, 4, and 5 until the search window has searched to the lower right of the image.
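Steps 1 to 6 can be tied together in a compact sketch. It is illustrative only: block averaging for reduction, a simple intensity rescale for preprocessing, and a mean-intensity function standing in for each trained network are all assumptions, and every name (`detect_faces`, `mean_score`, `networks`) is hypothetical.

```python
def reduce_image(image, factor):
    """Step 1: shrink by averaging factor x factor blocks."""
    h, w = len(image) // factor, len(image[0]) // factor
    return [[sum(image[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor)) / factor ** 2
             for c in range(w)] for r in range(h)]

def preprocess(image):
    """Step 2 (placeholder): rescale intensities to [0, 1]."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0
    return [[(v - lo) / span for v in row] for row in image]

def mean_score(sub):
    """Stand-in for a trained network: mean intensity of the window."""
    flat = [v for row in sub for v in row]
    return sum(flat) / len(flat)

def detect_faces(image, networks, threshold, factor=2):
    """networks maps a window size to the scoring function used for that size."""
    small = preprocess(reduce_image(image, factor))
    h, w = len(small), len(small[0])
    found = []
    for size, net in sorted(networks.items()):        # one network per window size
        for r in range(h - size + 1):                 # Step 6: scan to the lower right
            for c in range(w - size + 1):
                sub = [row[c:c + size] for row in small[r:r + size]]  # Step 3
                score = net(sub)                      # Step 4
                if score >= threshold:                # Step 5
                    found.append((size, r, c, score))
    return found

# An 8x8 image with a bright 4x4 square playing the role of a face.
image = [[255 if 2 <= r <= 5 and 2 <= c <= 5 else 0 for c in range(8)] for r in range(8)]
faces = detect_faces(image, {2: mean_score, 3: mean_score}, threshold=0.9)
```

The returned positions are coordinates in the reduced image; a real system would replace `mean_score` with networks trained as described earlier.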

From the above description, those skilled in the art will appreciate that various changes and modifications can be made without departing from the technical idea of the present invention.

As described above, according to the present invention, in the field of computer-vision-based face recognition, face misjudgments caused by loss of the pixels' original features, a drawback of conventional face detection methods, can be reduced, and faces of various sizes in an image can be detected effectively.

Claims

A face detection method using a multi-dimensional neural network, in which a specific object is recognized in an image signal acquired from a camera through the recognition pattern of a neural network, the method comprising:

a first step of reducing the input digital image received from the camera;

a second step of performing preprocessing, such as noise removal and illumination compensation, on the image reduced in the first step;

a third step of sampling sub-images from the upper left to the lower right of the reduced image using N search windows, each of whose size equals the size of the sub-image input to the neural network and differs by dimension;

a fourth step of inputting each sub-image sampled in the third step to the multilayer perceptron neural network corresponding to that sub-image and computing a result;

a fifth step of judging a face if the output value of the neural network computed in the fourth step is equal to or greater than a threshold, and judging a non-face if it is smaller; and

a sixth step of repeating the third to fifth steps until the search window used in the third step has searched to the lower right of the entire image range.

A face detection device using a multi-dimensional neural network, for recognizing a specific object in an image signal acquired from a camera, the device comprising:

a camera that captures and reads an image of a specific area;

a screen reduction unit that reduces the image data acquired from the camera to a predetermined size and outputs it;

a preprocessing unit that removes noise mixed into the image reduced by the screen reduction unit;

a plurality of image filters, each of which receives the noise-free image signal from the preprocessing unit and passes only the image of a preset region;

a multi-dimensional neural network that receives the plurality of sampled image data output from the image filters and, over P training patterns, repeatedly applies the process of computing the total output-layer error (Equation 3) by forward propagation (Equation 1), differentiating the output-layer error with respect to the value of each hidden-layer neuron (Equation 5), and changing the synaptic weights in the direction that reduces the error (Equation 4);

a threshold range determination unit that sequentially receives the learning patterns and recognition patterns output from the multi-dimensional neural network and judges whether they fall within a preset threshold range; and

a face data storage unit that stores the face data obtained by the threshold range determination unit.