KR102563522B1

KR102563522B1 - Apparatus, method and computer program for recognizing face of user

Info

Publication number: KR102563522B1
Application number: KR1020180015791A
Authority: KR
Inventors: 문일현; 박진욱; 최인환
Original assignee: 주식회사 케이티
Priority date: 2018-02-08
Filing date: 2018-02-08
Publication date: 2023-08-04
Also published as: KR20190100529A

Abstract

An apparatus for recognizing a user's face from an image includes: an input unit that receives an image; a facial feature point detection unit that extracts, from the image, an image containing the user's face and detects the user's facial feature points from the extracted image; a result value derivation unit that derives result values for the detected facial feature points using a plurality of learning models; an error value derivation unit that derives an error value for each learning model based on the derived result values; and a face recognition unit that recognizes the face by reflecting the derived error values.

Description

Apparatus, method and computer program for recognizing a user's face {APPARATUS, METHOD AND COMPUTER PROGRAM FOR RECOGNIZING FACE OF USER}

The present invention relates to an apparatus, a method, and a computer program for recognizing a user's face.

As devices that can interact with people, such as computers and smartphones, have become widespread, research on natural user interface (NUI) technology for natural interaction between people and devices is being actively conducted. As one NUI technology, a face-based interface has the advantage of enabling natural and intuitive interaction, and it is used in fields such as human-computer interaction (HCI), human-robot interaction (HRI), and human-machine interaction (HMI).

Regarding such face-based interfaces, Korean Registered Patent No. 10-179556, a prior art document, discloses a method and apparatus for personal authentication through face recognition and facial motion pattern recognition.

Conventionally, to recognize a face in an input image, a descriptor capable of extracting facial feature points was designed, and a distance function was then used to compare the extracted features. This approach had the disadvantage that it was difficult to extract feature data suitable for use in machine learning. When deep learning was used to extract facial feature points in order to improve on this, it remained difficult to extract well-performing feature data that represents facial features well for faces that were not used in training.

With a learning method that uses a conventional face classifier, only one condition of the given training data is considered, so errors occur as the user's face changes. The present invention aims to provide an apparatus, a method, and a computer program for recognizing a face that minimize such errors by considering a variety of conditions. It also aims to provide an apparatus, a method, and a computer program for recognizing a face that generate, through deep learning, a learning model capable of extracting feature data robust to various illumination changes and pose changes. It further aims to provide an apparatus, a method, and a computer program for recognizing a face that, when a deep-learning-based extractor of facial feature point data is designed, extract the facial feature point data using a multi-object loss function together with softmax. However, the technical problems to be solved by the present embodiment are not limited to those described above, and other technical problems may exist.

As a means for achieving the above technical problems, an embodiment of the present invention may provide a face recognition apparatus including: an input unit that receives an image; a facial feature point detection unit that extracts, from the image, an image containing the user's face and detects the user's facial feature points from the extracted image; a result value derivation unit that derives result values for the detected facial feature points using a plurality of learning models; an error value derivation unit that derives an error value for each learning model based on the derived result values; and a face recognition unit that recognizes the face by reflecting the derived error values.

Another embodiment of the present invention may provide a face recognition method including: receiving an image; extracting, from the image, an image containing the user's face and detecting the user's facial feature points from the extracted image; deriving result values for the detected facial feature points using a plurality of learning models; deriving an error value for each learning model based on the derived result values; and recognizing the face by reflecting the derived error values.

Yet another embodiment of the present invention may provide a computer program stored on a medium, the program including a sequence of instructions that, when executed by a computing device, cause the device to receive an image, extract from the image an image containing the user's face, detect the user's facial feature points from the extracted image, derive result values for the detected facial feature points using a plurality of learning models, derive an error value for each learning model based on the derived result values, and recognize the face by reflecting the derived error values.

The above means for solving the problems are merely illustrative and should not be construed as limiting the present invention. In addition to the exemplary embodiments described above, there may be additional embodiments described in the drawings and the detailed description of the invention.

According to any one of the above means for solving the problems, whereas a learning method using a conventional face classifier considers only one condition of the given training data and therefore produces errors as the user's face changes, it is possible to provide an apparatus, a method, and a computer program for recognizing a face that minimize such errors by considering a variety of conditions. It is possible to provide an apparatus, a method, and a computer program for recognizing a face that generate, through deep learning, a learning model capable of extracting feature data robust to various illumination changes and pose changes. It is also possible to provide an apparatus, a method, and a computer program for recognizing a face that, when a deep-learning-based extractor of facial feature point data is designed, extract the facial feature point data using a multi-object loss function together with softmax.

FIG. 1 is a configuration diagram of a face recognition apparatus according to an embodiment of the present invention.
FIGS. 2A and 2B are exemplary diagrams illustrating an image transformation matrix and parameter matrices according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram illustrating a process in which the face recognition apparatus according to an embodiment of the present invention applies preset weights according to the importance of each learning model.
FIG. 4 is a flowchart of a method by which the face recognition apparatus according to an embodiment of the present invention recognizes a face from an image.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that a person of ordinary skill in the art to which the present invention pertains can readily practice it. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. To describe the present invention clearly, parts of the drawings unrelated to the description are omitted, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" with another element interposed between them. In addition, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them, and it should be understood that this does not preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In this specification, a "unit" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. One unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware.

In this specification, some of the operations or functions described as being performed by a terminal or device may instead be performed by a server connected to that terminal or device. Likewise, some of the operations or functions described as being performed by a server may be performed by a terminal or device connected to that server.

Hereinafter, an embodiment of the present invention is described in detail with reference to the accompanying drawings.

The present invention proposes a technique in which, to minimize the class error of each face, the user's facial feature point data, illumination change data, pose change data, and other people's facial feature point data are input to a classifier, and the classifier is trained so as to minimize the error of the face recognition result.

FIG. 1 is a configuration diagram of a face recognition apparatus according to an embodiment of the present invention. Referring to FIG. 1, the face recognition apparatus 100 may include a training unit 110, an input unit 120, a facial feature point detection unit 130, an illumination component extraction unit 140, a result value derivation unit 150, an error value derivation unit 160, a total loss rate derivation unit 170, and a face recognition unit 180.

The training unit 110 may train facial feature points using a plurality of learning models. The plurality of learning models may include at least one of a user's face learning model, another person's face learning model, an illumination learning model, and a pose learning model.

For example, the training unit 110 may train facial feature points by dividing the training data for the plurality of learning models into one or more regions based on the facial feature points detected from that training data.

The training unit 110 may store, in a buffer, a representative feature vector for each facial feature point, with as many entries as the number of facial feature points used in the training data. For example, because the facial feature point data of the same person forms around a specific vector (the center of a cluster), that central representative feature vector may be stored in a separate buffer so that it serves as the focal point of training.
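The representative-vector buffer described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class name `CenterBuffer`, the integer `key` used to index an entry, and the moving-average update with a `momentum` parameter are all hypothetical, since the source does not say how the stored vector is computed or refreshed.

```python
import numpy as np


class CenterBuffer:
    """Hypothetical buffer holding one representative (cluster-center) vector per entry."""

    def __init__(self, num_entries: int, dim: int, momentum: float = 0.9):
        self.centers = np.zeros((num_entries, dim), dtype=np.float64)
        self.initialized = np.zeros(num_entries, dtype=bool)
        self.momentum = momentum  # assumed smoothing factor, not from the source

    def update(self, key: int, feature) -> None:
        """Move the stored center toward a newly observed feature vector."""
        feature = np.asarray(feature, dtype=np.float64)
        if not self.initialized[key]:
            # The first observation simply becomes the center.
            self.centers[key] = feature
            self.initialized[key] = True
        else:
            m = self.momentum
            self.centers[key] = m * self.centers[key] + (1.0 - m) * feature

    def distance_to_center(self, key: int, feature) -> float:
        """Euclidean distance between a feature vector and its stored center."""
        diff = np.asarray(feature, dtype=np.float64) - self.centers[key]
        return float(np.linalg.norm(diff))
```

A training loop could call `update` for outputs it trusts and use `distance_to_center` as a signal for how far a new feature has drifted from the stored center.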

The training unit 110 may input different images of the same user as training data, according to a predetermined number of inputs, and obtain output values for the facial feature points, for the facial feature points under illumination change, and for the facial feature points under pose change.

The training unit 110 may use various deep learning techniques, for example a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), or deep Q-networks, but is not limited to these.

The input unit 120 may receive an image. The image may be, for example, an image captured by a smartphone, a dashboard camera (black box), a CCTV camera, or the like and input through a network.

The facial feature point detection unit 130 may extract, from the image, an image containing the user's face and detect the user's facial feature points from the extracted image.

The illumination component extraction unit 140 may extract an illumination component from the image. The purpose is to enlarge the training data by simulating the extracted illumination component, so that facial feature points robust to illumination changes can be learned.

The result value derivation unit 150 may derive result values for the detected facial feature points using the plurality of learning models. The process of deriving a result value for the detected facial feature points with each learning model is described below.

Using the user's face learning model, the result value derivation unit 150 may derive a result value by comparing the user's facial feature points detected from the image with the user's facial feature points detected from a plurality of other images of the user. For example, the result value derivation unit 150 may use the user's face learning model to compare the user's facial feature points detected from the image with the user's facial feature points detected from another image selected at random from the same class.

Using the other person's face learning model, the result value derivation unit 150 may derive a result value by comparing the user's facial feature points detected from the image with another person's facial feature points detected from an image of that person. For example, to improve discrimination from other users' faces, the result value derivation unit 150 may use the other person's face learning model to compare the user's facial feature points detected from the image with another person's facial feature points detected from an image selected at random from a different class.

Using the illumination learning model, the result value derivation unit 150 may derive a result value by comparing the user's facial feature points detected from the image with the user's facial feature points detected from an image whose illumination component has been changed. For example, the result value derivation unit 150 may simulate a change in the illumination component extracted by the illumination component extraction unit 140 and compare the user's facial feature points detected from the image with those detected from the image with the changed illumination component.

Using the pose learning model, the result value derivation unit 150 may generate an image transformation matrix based on the user's facial feature points detected from the image, convert the extracted image into a two-dimensional image through pose simulation using the generated image transformation matrix, and derive a pose simulation result value for the converted two-dimensional image.

The result value derivation unit 150 may use a homography transformation to derive the pose simulation result value with the pose learning model. A homography is the fixed transformation relationship that holds between corresponding projected points when one plane is projected onto another. For example, the result value derivation unit 150 may use the homography transformation to rotate, scale, translate, and project the image, simulating the pose by transforming a three-dimensional face image obtainable from a camera into a two-dimensional image, and thereby derive the pose simulation result value. The process of generating the image transformation matrix is described in detail with reference to FIGS. 2A and 2B.

FIGS. 2A and 2B are exemplary diagrams illustrating an image transformation matrix and parameter matrices according to an embodiment of the present invention.

FIG. 2A is an exemplary diagram illustrating an image transformation matrix according to an embodiment of the present invention. The result value derivation unit 150 may generate an image transformation matrix to cope with various pose changes. Image transformations that can occur in three-dimensional space may also be generated through simulation.

Referring to FIG. 2A, the parameter matrices used to generate the image transformation matrix may include A₂, T, R, and A₁. For example, A₂ is a matrix for converting from 3D to 2D, T is a transformation matrix for the X, Y, and Z axes, R represents rotation about the X, Y, and Z axes, and A₁ is a matrix for converting from 2D to 3D.

FIG. 2B is an exemplary diagram illustrating a result calculated based on the image transformation matrix according to an embodiment of the present invention.

Here, the other inputs may be given within a certain range. The last input, the camera focal length f, does not greatly affect the result even if it does not exactly match the focal length of the camera actually in use, so a value between 100 and 300 may be chosen, avoiding values that are too large or too small.
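As an illustration of how the parameter matrices A₂, T, R, and A₁ could combine into a single image transformation matrix, the sketch below composes them as H = A₂ · T · R · A₁. The concrete matrix entries follow a commonly used construction for simulating a 3D rotation of a planar image and are an assumption, as are the function names, the reading of T as a translation, and the default translation dz = f; the figures of the source are not reproduced in this text.

```python
import numpy as np


def pose_homography(width, height, rx=0.0, ry=0.0, rz=0.0,
                    dx=0.0, dy=0.0, dz=None, f=200.0):
    """Compose a 3x3 homography H = A2 @ T @ R @ A1 (assumed construction).

    rx, ry, rz are rotation angles in radians; f is the focal length,
    chosen here inside the 100 to 300 range mentioned in the text.
    """
    if dz is None:
        dz = f  # with dz == f and no rotation, H reduces to the identity (up to scale)

    # A1: lift 2D pixel coordinates into 3D, centering the image at the origin.
    a1 = np.array([[1, 0, -width / 2.0],
                   [0, 1, -height / 2.0],
                   [0, 0, 0],
                   [0, 0, 1]], dtype=float)

    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    r_x = np.array([[1, 0, 0, 0], [0, cx, -sx, 0], [0, sx, cx, 0], [0, 0, 0, 1]], dtype=float)
    r_y = np.array([[cy, 0, -sy, 0], [0, 1, 0, 0], [sy, 0, cy, 0], [0, 0, 0, 1]], dtype=float)
    r_z = np.array([[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
    r = r_x @ r_y @ r_z  # R: rotation about the X, Y, and Z axes

    # T: translation along the X, Y, and Z axes.
    t = np.array([[1, 0, 0, dx], [0, 1, 0, dy], [0, 0, 1, dz], [0, 0, 0, 1]], dtype=float)

    # A2: project 3D back to 2D with focal length f, restoring the image center.
    a2 = np.array([[f, 0, width / 2.0, 0],
                   [0, f, height / 2.0, 0],
                   [0, 0, 1, 0]], dtype=float)

    return a2 @ t @ r @ a1


def apply_homography(h, points):
    """Map an (n, 2) array of feature points through the homography h."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Calling `apply_homography(pose_homography(w, h, ry=0.3), landmarks)` would give pose-simulated landmark positions under these assumptions; the image center stays fixed for any rotation because it sits at the origin of the lifted coordinates.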

Returning to FIG. 1, the error value derivation unit 160 may derive an error value for each learning model based on the derived result values.

The error value derivation unit 160 may derive an error value based on the result value derived using the user's face learning model or the other person's face learning model and on the probability that the derived result value corresponds to the user. To this end, the error value derivation unit 160 may derive the error value using Equation 1.

Equation 1 may be used to derive the error value (face_pair_loss) for output values obtained using the user's face learning model, the other person's face learning model, and the illumination learning model.

Here, p is the probability output by the learning network for the pair given as input, and a pair is an input pair consisting of the correct answer and the computed value. The value gt of a pair is set to 0 or 1 depending on whether the pair is a positive pair (the user) or a negative pair (another person). For example, if the value for a positive pair is 1, the value for a negative pair may be 0. In addition, by updating the representative feature vector stored in the separate buffer for output values that show a high probability, a guide can be provided so that training proceeds in an increasingly stable direction.
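Equation 1 itself is not reproduced in this text. One common loss that fits the description of a network probability p and a 0/1 label gt is binary cross-entropy, sketched below purely as an assumption; the actual Equation 1 may differ, and the clamping constant `eps` is an illustrative detail.

```python
import math


def face_pair_loss(p: float, gt: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one pair (assumed form, not confirmed by the source).

    p  : probability output by the network for the input pair
    gt : 1 for a positive pair (same user), 0 for a negative pair (another person)
    """
    # Clamp p so that log() never receives exactly 0.
    p = min(max(p, eps), 1.0 - eps)
    return -(gt * math.log(p) + (1 - gt) * math.log(1.0 - p))
```

Under this form, a confident correct prediction gives a loss near zero, and a confident wrong prediction gives a large loss.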

The error value derivation unit 160 may derive an error value by evaluating the user's pose estimation results contained in the pose simulation result values for the camera focal length. To this end, the error value derivation unit 160 may derive the error value using Equation 2.

Equation 2 may be used to derive the error value (pose_loss) for output values obtained using the pose learning model.

The input for pose change consists of n points of the face, and the error value can be calculated by entering into Equation 2 the pose estimation results for a pair of images of the same person taken from the pose simulation output. Here, the L2-norm may be calculated using the Euclidean distance function, which gives the distance between points.
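The L2-norm comparison described above can be sketched as follows. Equation 2 is not reproduced in this text, so averaging the per-point Euclidean distances over the n points is an assumption, and the function name mirrors the `pose_loss` label only for readability.

```python
import numpy as np


def pose_loss(pose_a, pose_b) -> float:
    """Mean Euclidean (L2) distance between two sets of n facial points.

    pose_a, pose_b: arrays of shape (n, d) holding the pose estimation
    results for two images of the same person. The averaging over points
    is an assumed choice, not taken from the source.
    """
    a = np.asarray(pose_a, dtype=float)
    b = np.asarray(pose_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("both pose estimates must have the same shape")
    per_point = np.linalg.norm(a - b, axis=-1)  # L2 distance for each point
    return float(per_point.mean())
```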

Here, Equations 1 and 2 may be formulas derived on the basis of Metric(f(original)-f(positive)) + Metric(f(original)-f(light_simulated)) + Metric(f(original)-f(pose_simulated)) - Metric(f(original)-f(negative)). Metric(f(original)-f(positive)) denotes the difference between the original image and another image of the user, Metric(f(original)-f(light_simulated)) denotes the difference between the original image and the illumination-changed image, Metric(f(original)-f(pose_simulated)) denotes the difference between the original image and the pose-changed image, and Metric(f(original)-f(negative)) denotes the difference between the original image and another person's image.

In this way, the distance between different images of the same class can be minimized, and the distance to faces of other classes can be maximized.
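The combined expression over the four Metric terms can be sketched as below. The source does not define Metric, so using the Euclidean distance between feature vectors is an assumption, and the function names are hypothetical.

```python
import numpy as np


def metric(a, b) -> float:
    """Euclidean distance between two feature vectors (assumed choice of Metric)."""
    return float(np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)))


def combined_objective(original, positive, light_simulated, pose_simulated, negative) -> float:
    """Sum of same-class distances minus the distance to the other-class feature.

    Minimizing this value pulls the positive, illumination-simulated, and
    pose-simulated features toward the original and pushes the negative away.
    """
    return (metric(original, positive)
            + metric(original, light_simulated)
            + metric(original, pose_simulated)
            - metric(original, negative))
```

The arguments stand for the network outputs f(original), f(positive), and so on, computed elsewhere.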

The total loss rate derivation unit 170 may derive a total loss rate by applying to the derived error values weights preset according to the importance of each learning model. For example, as in Equation 3 below, the total loss rate derivation unit 170 may multiply each derived error value by a weight chosen according to importance and derive the total loss rate so that the sum of the weighted error values is minimized.

Here, pos_pair scale is the weight for the user's learning model, loss_pospair is the error value derived using the user's learning model, light_sim scale is the weight for the illumination learning model, loss_lightsim is the error value derived using the illumination learning model, pos_sim scale is the weight for the pose learning model, loss_posesim is the error value derived using the pose learning model, neg_pair scale is the weight for the other person's learning model, and loss_negpair is the error value derived using the other person's learning model.
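Equation 3 is not reproduced in this text. From the description of a sum of error values multiplied by their weights and from the symbols defined above, it presumably has the following form; this is a reconstruction, and the symbol for the total loss rate is an assumption.

```latex
L_{\text{total}} =
    \text{pos\_pair scale} \cdot \text{loss}_{\text{pospair}}
  + \text{light\_sim scale} \cdot \text{loss}_{\text{lightsim}}
  + \text{pos\_sim scale} \cdot \text{loss}_{\text{posesim}}
  + \text{neg\_pair scale} \cdot \text{loss}_{\text{negpair}}
```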

By multiplying the error value derived with each learning model by its weight, the total loss rate derivation unit 170 can feed the error into backpropagation so that further training proceeds.

To compute the overall error value, the error calculated for each pair is multiplied by the desired learning weight, and the sum is then reflected in the weights of each node of the network to carry out training. In general, the learning weights can be chosen freely by the person performing the training; in the present invention, the best recognition results were obtained when training was performed as follows, taking into account the importance of each given pair.

FIG. 3 is an exemplary diagram illustrating a process in which the face recognition apparatus according to an embodiment of the present invention applies preset weights according to the importance of each learning model. Referring to FIG. 3, the total loss rate derivation unit 170 may apply a preset weight for each learning model. For example, based on the user's facial feature point data 300, the total loss rate derivation unit 170 may apply a weight of 2.0 to the facial feature point data 310 derived through the user's learning model, a weight of 1.0 to the facial feature point data 320 derived through the illumination learning model, a weight of 1.0 to the facial feature point data 330 derived through the pose learning model, and a weight of 3.0 to the facial feature point data 340 derived through the other person's learning model.
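The weighting in FIG. 3 can be worked through with a short sketch. The weights 2.0, 1.0, 1.0, and 3.0 come from the text; the dictionary keys, the function name, and the sample error values are hypothetical and chosen only to illustrate the arithmetic.

```python
# Weights stated for FIG. 3; the key names are illustrative.
FIG3_WEIGHTS = {
    "pos_pair": 2.0,   # user's learning model
    "light_sim": 1.0,  # illumination learning model
    "pose_sim": 1.0,   # pose learning model
    "neg_pair": 3.0,   # other person's learning model
}


def total_loss(losses: dict, weights: dict = FIG3_WEIGHTS) -> float:
    """Weighted sum of the per-model error values."""
    missing = set(weights) - set(losses)
    if missing:
        raise KeyError(f"missing error values for: {sorted(missing)}")
    return sum(weights[name] * losses[name] for name in weights)


# Made-up error values, for illustration only.
example_losses = {"pos_pair": 0.5, "light_sim": 0.2, "pose_sim": 0.1, "neg_pair": 0.3}
example_total = total_loss(example_losses)  # 2.0*0.5 + 1.0*0.2 + 1.0*0.1 + 3.0*0.3
```

With these sample values the positive and negative pairs contribute 1.0 and 0.9 to the total, while the two simulation terms contribute 0.2 and 0.1, which reflects the higher importance given to the pair terms.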

Here, the weights may be set in various ways depending on the composition of the training data and the training method. Because the network is fundamentally trained for face recognition, it is preferable to give high weights to the user's pair (positive pair) and the other person's pair (negative pair) and low weights to the simulated output values for pose change and lighting, so that face recognition yields more accurate results.
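The weighting described above can be sketched as a weighted sum of the per-model error values. This is a minimal illustration only: the function name `total_loss`, the dictionary keys, and the sample error values are assumptions made for this sketch, and only the weights 2.0, 1.0, 1.0, and 3.0 come from the FIG. 3 example. The specification does not state the exact formula, so a plain weighted sum is assumed.

```python
# Weights from the FIG. 3 example; the key names are hypothetical labels.
EXAMPLE_WEIGHTS = {
    "user": 2.0,      # user's learning model (positive pair, data 310)
    "lighting": 1.0,  # lighting learning model (data 320)
    "pose": 1.0,      # pose learning model (data 330)
    "other": 3.0,     # other person's learning model (negative pair, data 340)
}


def total_loss(errors, weights=EXAMPLE_WEIGHTS):
    """Return the weighted sum of the per-model error values.

    Assumes the total loss rate is a plain weighted sum; the patent text
    only says that preset weights are reflected in the error values.
    """
    missing = set(weights) - set(errors)
    if missing:
        raise KeyError(f"missing error values for: {sorted(missing)}")
    return sum(weights[name] * errors[name] for name in weights)


if __name__ == "__main__":
    # Made-up error values, used only to exercise the function.
    sample_errors = {"user": 0.5, "lighting": 0.25, "pose": 0.25, "other": 0.5}
    print(total_loss(sample_errors))
```

Keeping the weights in a mapping makes it easy to try other settings, for example raising the positive-pair and negative-pair weights further while leaving the lighting and pose weights low.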

Returning to FIG. 2, the face recognition unit 180 may recognize the face by reflecting the derived error value. The face recognition unit 180 may also recognize the face by reflecting the derived total loss rate.

The face recognition apparatus 100 may be implemented by a computer program stored in a medium and including a sequence of instructions for recognizing a face from an image. When executed by a computing device, the computer program may include a sequence of instructions that cause the computing device to receive an image, extract an image including the user's face from the received image, detect the user's facial feature points from the extracted image, derive result values for the detected facial feature points using a plurality of learning models, derive an error value for each learning model based on the derived result values, and recognize the face by reflecting the derived error values.

FIG. 4 is a flowchart of a method of recognizing a face from an image in the face recognition apparatus according to an embodiment of the present invention. The method of recognizing a face from an image in the face recognition apparatus 100 shown in FIG. 4 includes steps that are processed in time series according to the embodiment shown in FIGS. 1 to 3. Therefore, even where omitted below, the description given for the embodiment shown in FIGS. 1 to 3 also applies to the method of recognizing a face from an image in the face recognition apparatus 100.

In step S410, the face recognition apparatus 100 may receive an image.

In step S420, the face recognition apparatus 100 may extract an image including the user's face from the received image and detect the user's facial feature points from the extracted image.

In step S430, the face recognition apparatus 100 may derive result values for the detected facial feature points using a plurality of learning models.

In step S440, the face recognition apparatus 100 may derive an error value for each learning model based on the derived result values.

In step S450, the face recognition apparatus 100 may recognize the face by reflecting the derived error values.

In the above description, steps S410 to S450 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present invention. In addition, some steps may be omitted as needed, and the order of the steps may be changed.
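The flow of steps S410 to S450 can be sketched as a single function that chains the stages. This is only an illustrative skeleton under stated assumptions: `recognize_face` and all of its parameters are hypothetical names, and the detector, the learning models, the error derivation, and the final decision are passed in as callables because the specification does not fix their implementations.

```python
from typing import Any, Callable, Dict, Mapping


def recognize_face(
    image: Any,
    detect_feature_points: Callable[[Any], Any],
    models: Mapping[str, Callable[[Any], Any]],
    derive_error: Callable[[str, Any], float],
    decide: Callable[[Dict[str, float]], Any],
) -> Any:
    """Chain steps S410 to S450; every callable is a caller-supplied stand-in."""
    # S410: the input image is received as the `image` argument.
    # S420: extract the face image and detect the facial feature points.
    points = detect_feature_points(image)
    # S430: derive a result value from each learning model.
    results = {name: model(points) for name, model in models.items()}
    # S440: derive an error value for each learning model from its result.
    errors = {name: derive_error(name, result) for name, result in results.items()}
    # S450: recognize the face by reflecting the derived error values.
    return decide(errors)


if __name__ == "__main__":
    # Toy stand-ins, used only to show the call shape.
    outcome = recognize_face(
        image="frame-0",
        detect_feature_points=lambda img: [1.0, 2.0],
        models={"user": sum, "other": max},
        derive_error=lambda name, result: result / 10,
        decide=lambda errs: min(errs, key=errs.get),
    )
    print(outcome)
```

Passing the stages in as arguments mirrors the statement that steps may be split, merged, omitted, or reordered: a caller can swap or wrap any stage without touching the skeleton.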

The method of recognizing a face from an image in the face recognition apparatus described with reference to FIGS. 1 to 4 may also be implemented in the form of a computer program stored in a medium and executed by a computer, or in the form of a recording medium containing instructions executable by a computer. The method of recognizing a face from an image in the face recognition apparatus described with reference to FIGS. 1 to 4 may also be implemented in the form of a computer program stored in a medium and executed by a computer.

A computer-readable medium may be any available medium that can be accessed by a computer, and includes volatile and nonvolatile media as well as removable and non-removable media. A computer-readable medium may also include a computer storage medium. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data.

The above description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can be readily modified into other specific forms without changing the technical spirit or essential features of the present invention. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is indicated by the claims set out below rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and from their equivalents should be construed as falling within the scope of the present invention.

100: face recognition device
110: training unit
120: input unit
130: facial feature point detection unit
140: lighting component extraction unit
150: result value derivation unit
160: error value derivation unit
170: total loss rate derivation unit
180: face recognition unit

Claims

An apparatus for recognizing a user's face from an image, the apparatus comprising:
an input unit that receives an image;
a facial feature point detection unit that extracts, from the image, an image including the user's face and detects the user's facial feature points from the extracted image;
a result value derivation unit that derives result values for the detected facial feature points using a plurality of learning models;
an error value derivation unit that derives an error value for each learning model based on the derived result values; and
a face recognition unit that recognizes the face by reflecting the derived error values,
wherein the plurality of learning models include a user's face learning model, another person's face learning model, a lighting learning model, and a pose learning model, and
wherein the apparatus further comprises a total loss rate derivation unit that derives a total loss rate by applying, to the error value derived for each learning model, a weight preset according to the importance of that learning model.

According to claim 1,
The face recognition apparatus further comprising a training unit that trains the facial feature points using the plurality of learning models.

According to claim 2,
The face recognition apparatus, wherein the training unit trains the facial feature points by dividing the training data for the plurality of learning models into at least one region based on facial feature points detected from the training data.

According to claim 1,
The face recognition apparatus, wherein the result value derivation unit uses the user's face learning model to derive a result value by comparing the user's facial feature points detected from the image with the user's facial feature points detected from a plurality of other images of the user.

According to claim 2,
The face recognition apparatus, wherein the result value derivation unit uses the other person's face learning model to derive a result value by comparing the user's facial feature points detected from the image with another person's facial feature points detected from an image of the other person.

According to claim 2,
The face recognition apparatus further comprising a lighting component extraction unit that extracts a lighting component from the image,
wherein the error value derivation unit uses the lighting learning model to derive a result value by comparing the user's facial feature points detected from the image with the user's facial feature points detected from the image in which the lighting component has been changed.

According to claim 3,
The face recognition apparatus, wherein the error value derivation unit derives the error value based on a result value derived through the user's face learning model or the other person's face learning model and on a probability that the derived result value corresponds to the user.

According to claim 2,
The face recognition apparatus, wherein the result value derivation unit uses the pose learning model to generate an image transformation matrix based on the user's facial feature points detected from the image, converts the extracted image into a two-dimensional image through pose simulation using the generated image transformation matrix, and derives a result value of the pose simulation for the converted two-dimensional image.

According to claim 8,
The face recognition apparatus, wherein the error value derivation unit derives the error value by calculating, with respect to the focal length of a camera, the user's pose estimation result included in the result value of the pose simulation.

According to claim 1,
The face recognition apparatus, wherein the face recognition unit recognizes the face based on the derived total loss rate.

A method of recognizing a user's face from an image in a face recognition apparatus, the method comprising:
receiving an image;
extracting, from the image, an image including the user's face and detecting the user's facial feature points from the extracted image;
deriving result values for the detected facial feature points using a plurality of learning models;
deriving an error value for each learning model based on the derived result values; and
recognizing the face by reflecting the derived error values,
wherein the plurality of learning models include a user's face learning model, another person's face learning model, a lighting learning model, and a pose learning model, and
wherein the method further comprises deriving a total loss rate by applying, to the error value derived for each learning model, a weight preset according to the importance of that learning model.

According to claim 11,
The face recognition method further comprising training the facial feature points using the plurality of learning models.

According to claim 12,
The face recognition method, wherein the training comprises training the facial feature points by dividing the training data for the plurality of learning models into at least one region based on facial feature points detected from the training data.

According to claim 11,
The face recognition method, wherein the recognizing of the face comprises recognizing the face based on the derived total loss rate.

A computer program stored in a medium and including a sequence of instructions for recognizing a user's face in a face recognition apparatus,
wherein the computer program, when executed by a computing device, includes a sequence of instructions that cause the computing device to:
receive an image;
extract, from the image, an image including the user's face and detect the user's facial feature points from the extracted image;
derive result values for the detected facial feature points using a plurality of learning models;
derive an error value for each learning model based on the derived result values;
recognize the face by reflecting the derived error values; and
derive a total loss rate by applying, to the error value derived for each learning model, a weight preset according to the importance of that learning model,
wherein the plurality of learning models include a user's face learning model, another person's face learning model, a lighting learning model, and a pose learning model.