KR20060020998A

KR20060020998A - Method for detecting a face and apparatus thereof

Info

Publication number: KR20060020998A
Application number: KR1020040069778A
Authority: KR
Inventors: 김원중
Original assignee: (주)제니텀 엔터테인먼트 컴퓨팅
Priority date: 2004-09-02
Filing date: 2004-09-02
Publication date: 2006-03-07

Abstract

The present invention relates to a method for extracting an optimal face region from a digital image, such as a video or a still image that contains a human face. In particular, it provides a method for speeding up and optimizing face region extraction from images acquired with various image acquisition devices such as mobile phones, digital cameras, and web cameras.

The input image is divided into several small regions, feature information is generated for each region, the generated feature values are learned, and the learned result is used to extract the face region.

The present invention includes a preprocessing step for the input image, a step of generating feature values for extraction, a step of classifying regions as face or non-face using previously learned results, a step of extracting the face, and a step of training on faces.

Face region, extraction method, AdaBoosting learning method

Description

Method for extracting an optimal face region {Method for detecting a face and apparatus thereof}

FIG. 1 is a block diagram showing the overall configuration of the present invention.

FIG. 2 is an overall flowchart of the face extraction method.

FIG. 3 is a detailed flowchart of the AdaBoosting learning method.

FIG. 4 shows a face extracted according to the present invention.

In general, face extraction is an essential step in face recognition: the extracted, normalized face region is filtered to produce feature values, which are then used in the recognition process. A typical approach to face extraction locates the face region from the color information of the face. However, the performance of color-based face localization depends heavily on image acquisition conditions such as the camera and the lighting. In particular, face and skin color values vary considerably with illumination, and when the acquisition conditions are unknown it is difficult to determine the range of skin color values needed to isolate the face-colored area. It is also difficult to pick out only the face region from the broadly extracted skin-like colors, which include background regions.

The present invention was devised to solve the above problems, and its object is to provide a method for extracting a face quickly and accurately even when images are acquired under varied environmental conditions.

To achieve the above object, the face extraction method according to the present invention includes: feature value extraction means that sums the pixel values within each fixed sub-region (SubWindow) of the input image to obtain a representative value, and then extracts feature values from these representative values using a filter for finding the contour components of the image; learning means that trains on faces using the extracted feature values; and classification means that distinguishes face regions from non-face regions using the learned result.

The basic concepts applied in the present invention are described below.

Whereas existing approaches extract faces from input images under restrictive conditions (for example, color images, black-and-white images, or images with complex backgrounds), the present invention places no such restriction on the input. The input image is divided into fixed sub-regions, and all pixel values contained in each sub-region are summed to give that region's representative value. Compared with conventional image processing that uses each pixel value individually, this reduces the amount of computation and has the effect of removing local noise. To extract the feature values of the face region, the difference between a region's representative value and that of each neighboring region is computed in the vertical, horizontal, and diagonal directions. These feature values are then used by the learning means to learn face regions and non-face regions. Training first requires a set of images that contain faces and a set of images that do not. The training images are normalized to the same size. The face image set is composed of frontal faces, rotated faces, faces under different lighting conditions, faces partly covered by a hand or similar object, and so on, so that faces can be extracted under a variety of shooting conditions.
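The sub-region sums and neighbor differences described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the list-of-lists image layout, and the choice of non-overlapping square blocks are assumptions made for the example.

```python
def block_sums(image, block):
    """Sum the pixel values inside each block x block sub-region."""
    rows, cols = len(image) // block, len(image[0]) // block
    sums = [[0] * cols for _ in range(rows)]
    for y in range(rows * block):
        for x in range(cols * block):
            sums[y // block][x // block] += image[y][x]
    return sums


# Offsets to the eight neighbours: up, down, left, right and the four diagonals.
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1),
              (-1, -1), (-1, 1), (1, -1), (1, 1)]


def neighbour_differences(sums, r, c):
    """Difference between region (r, c) and each neighbour that exists."""
    diffs = {}
    for dr, dc in NEIGHBOURS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(sums) and 0 <= nc < len(sums[0]):
            diffs[(dr, dc)] = sums[r][c] - sums[nr][nc]
    return diffs
```

Because each sub-region is reduced to one number, the later difference step touches far fewer values than a per-pixel filter would, which is the computational saving the description refers to.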

AdaBoosting was used as the learning method. Because this method learns more effectively on noise-free data, the training images were passed through a noise-removal filter before training. The performance of the AdaBoosting method is determined by the following three factors.
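For readers unfamiliar with boosting, the sketch below shows the standard discrete AdaBoost procedure with single-feature threshold classifiers ("stumps") as weak learners. The document does not specify its weak learner or update rule, so the stump form, the labels +1 (face) and -1 (non-face), and all function names here are assumptions for illustration.

```python
import math


def stump_predict(stump, x):
    """Classify x as +1 or -1 by thresholding one feature."""
    f, thr, pol = stump
    return 1 if pol * (x[f] - thr) >= 0 else -1


def train_stump(xs, ys, weights):
    """Pick the (feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for f in range(len(xs[0])):
        for thr in sorted({x[f] for x in xs}):
            for pol in (1, -1):
                stump = (f, thr, pol)
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump_predict(stump, x) != y)
                if best is None or err < best[0]:
                    best = (err, stump)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` weighted stumps; returns a list of (alpha, stump)."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, stump = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score >= 0 else -1
```

The reweighting step is why label or pixel noise hurts: a noisy sample that keeps being misclassified gains weight every round, which is consistent with the description's choice to denoise the training images first.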

First, when the facial feature values are computed in all sub-regions and used stage by stage, performance and execution time depend on which feature value the first weak learner (Weak Learner) uses. The present invention therefore exploits the fact that the eyes within the face region differ greatly in brightness from the surrounding face-colored area, and trains using the feature values around the eyes.

Second, within the cascade (Cascade) structure, performance depends on how the threshold for passing a candidate on to the next stage is set. The point that minimizes the sum of the rate at which non-face regions are detected as faces and the rate at which faces are classified as non-face regions was determined experimentally.
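One simple way to locate such a point is to sweep candidate thresholds over a validation set and keep the one with the smallest combined error. The sketch below assumes each sample has a scalar stage score and that a score at or above the threshold means "face"; the document does not state how its experiment was run, so this is only an illustration.

```python
def pick_threshold(face_scores, nonface_scores):
    """Return (threshold, error) minimising false-positive + false-negative rate."""
    best_thr, best_err = None, float("inf")
    for thr in sorted(set(face_scores) | set(nonface_scores)):
        # Faces scoring below the threshold are missed (false negatives).
        fn = sum(s < thr for s in face_scores) / len(face_scores)
        # Non-faces scoring at or above it are accepted (false positives).
        fp = sum(s >= thr for s in nonface_scores) / len(nonface_scores)
        if fn + fp < best_err:
            best_thr, best_err = thr, fn + fp
    return best_thr, best_err
```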

Third, performance depends on how many stages the cascade consists of. This number was also determined experimentally, as the point at which the error is smallest.

Face extraction is performed using the result learned in advance with these three factors combined. Because input images differ in size, the division into sub-regions starts from a base sub-region size of 24 pixels wide by 24 pixels high, and the sub-region size is then enlarged or reduced by a constant ratio when extracting the representative values. Training proceeds in this way until it converges to the minimum error.
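The multi-scale scan can be sketched as below. The 24-pixel base size comes from the description; the scale ratio of 1.25, the scan step, and the function names are assumptions, since the text only says "a constant ratio". For brevity the sketch only enlarges the window.

```python
def window_sizes(width, height, base=24, scale=1.25):
    """Square window sizes from `base` upward that still fit in the image."""
    sizes = []
    size = float(base)
    while int(size) <= min(width, height):
        sizes.append(int(size))
        size *= scale
    return sizes


def sliding_windows(width, height, size, step):
    """Yield (x, y, size) for every window of side `size` inside the image."""
    for y in range(0, height - size + 1, step):
        for x in range(0, width - size + 1, step):
            yield x, y, size
```

Each window produced this way would then be split into sub-regions and passed to the classifier, so a face is found regardless of how large it appears in the image.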

FIG. 3 shows the training process and the process of separating faces from non-faces using the learned result. A classifier (Classifier) organized in stages is used, and a candidate still judged to be a face at the final stage is determined to be a face.
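The staged decision can be illustrated with the following sketch. The data layout is hypothetical: each stage is a pair of a weak-learner list and a stage threshold, and each weak learner is an `(alpha, feature_index, feature_threshold)` tuple. The document does not define these structures.

```python
def cascade_is_face(stages, features):
    """Return True only if every stage accepts the candidate."""
    for weak_learners, stage_threshold in stages:
        # Each weak learner votes +1 or -1, weighted by its alpha.
        score = sum(alpha * (1 if features[idx] >= thr else -1)
                    for alpha, idx, thr in weak_learners)
        if score < stage_threshold:
            return False  # rejected early; later stages are skipped
    return True
```

Most non-face windows are rejected by the first stage or two, so the full set of weak learners is evaluated only for the few candidates that look like faces.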

The present invention detects faces stably and quickly under a variety of image acquisition conditions. As a result, even with simple devices such as a PC-based web camera or a camera attached to a mobile phone, or in environments where image quality is poor, it shows the same performance as with an ordinary camera.

The system and method according to the present invention provide an effective and optimized approach in fields that require face region extraction, such as computer systems that understand human behavior and respond appropriately, and surveillance systems that control access to computers or buildings.

Claims

An optimal face region extraction method comprising: a preprocessing step for brightness correction and noise removal of the original image; a step of dividing the preprocessed original image into fixed regions; a step of extracting feature information from each divided region; a step of learning using these feature values; and a step of detecting the face region using the learned result.

(a) Preprocessing the original image

(b) Dividing the original image into fixed regions

(c) Extracting feature values using the divided regions

(d) Learning using the feature values

(e) Detecting the face region using the learned result

The method of claim 2,

wherein, in preprocessing the image in step (a), the pixel value histogram of the image is used to adjust the histogram to a uniform brightness range, and the image is down-sampled to remove image noise.
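As an illustration of this preprocessing, the sketch below applies standard histogram equalization to an 8-bit grayscale image and then down-samples by block averaging. The claim names neither a specific equalization formula nor a down-sampling method, so both choices, and the function names, are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Spread the grey levels of `image` over the full 0..levels-1 range."""
    hist = [0] * levels
    for row in image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    # Cumulative distribution of the grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # single grey level: nothing to stretch
        return [row[:] for row in image]
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (total - cdf_min)))
           for c in cdf]
    return [[lut[v] for v in row] for row in image]


def downsample(image, factor=2):
    """Average each factor x factor block into one pixel (integer result)."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [[sum(image[r * factor + dy][c * factor + dx]
                 for dy in range(factor) for dx in range(factor))
             // (factor * factor)
             for c in range(cols)]
            for r in range(rows)]
```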

The method of claim 2,

wherein, in step (b), after the original image is divided into fixed regions, the representative value of each region is obtained as the sum of all pixel values in that region.

The method of claim 2,

wherein, in step (c), using the representative values of the divided regions obtained in step (b), feature values are extracted by taking the difference in representative value between the region for which the feature value is sought and the neighboring regions located above, below, to the left, to the right, and diagonally.

The method of claim 2,

wherein, in step (d), learning to distinguish face regions from non-face regions using the feature values extracted in step (c) is performed with the AdaBoosting learning method.

The method of claim 6,

wherein the training images are organized into a set of images with faces and a set of images without faces, equal in number, and the set of images with faces consists of frontal faces, rotated faces, faces partly covered by a hand or similar object, and faces under different lighting conditions.

The method of claim 2,

wherein, in step (e), face regions are classified using the result learned in step (d).

The method of claim 8,

wherein, in the classifier of each stage that distinguishes faces from non-faces, the threshold (Threshold) is set using the average of the feature values used in that stage.