KR102476016B1 - Apparatus and method for determining position of eyes

Info

Publication number: KR102476016B1
Application number: KR1020150152053A
Authority: KR
Inventors: 허진구; 홍 타오; 리우 지후아; 남동경; 수에 캉; 리 웨이밍; 왕 씨잉; 마 젠규; 왕 하이타오; 조우 밍카이
Original assignee: 삼성전자주식회사
Priority date: 2015-01-29
Filing date: 2015-10-30
Publication date: 2022-12-09
Also published as: CN105989326B; KR20160093523A; CN105989326A

Abstract

A method for determining eye position information is provided. The method includes identifying an eye region in a face image, identifying two-dimensional (2D) features in the eye region, determining a three-dimensional (3D) target model based on the 2D features, and determining 3D position information based on the 3D target model.

Description

Method and apparatus for determining eye position information {APPARATUS AND METHOD FOR DETERMINING POSITION OF EYES}

The following relates to three-dimensional (3D) stereoscopic display technology and, more specifically, to a method and apparatus for determining 3D eye position information.

Compared with two-dimensional (2D) display technology, 3D display technology can reproduce real scenes and give users the sensation of experiencing the displayed content directly, and it is attracting considerable attention in fields such as scientific research, entertainment, medicine, and the military. According to the principle of image formation, 3D display technology can be divided into glasses-type 3D display technology, which is based on binocular disparity, and glasses-free (autostereoscopic) 3D display technology, represented by holography and optical gratings.
Glasses-type 3D display systems are inconvenient in that a user can view a 3D stereoscopic image only while wearing special equipment (for example, polarized glasses or a helmet). Accordingly, autostereoscopic 3D display devices, such as tablet PCs and smartphones with an autostereoscopic display function, have been gradually emerging.

A method for determining eye position information according to an embodiment may include: identifying an eye region in a face image; identifying two-dimensional (2D) features in the eye region; and determining a 3D target model based on the 2D features and determining 3D position information based on the 3D target model.
In the method according to an embodiment, determining the 3D target model based on the 2D features may include obtaining parameters associated with the 2D features, and constructing the 3D target model based on the parameters.
In the method according to an embodiment, determining the 3D position information based on the 3D target model may include obtaining a matrix using the 3D target model and the 2D features, and determining the 3D position information based on the matrix.
In the method according to an embodiment, identifying the 2D features in the eye region may include comparing preceding and subsequent frames of the eye region to determine whether the eye is stationary, and identifying the 2D features in the eye region when the eye is determined not to be stationary.
In the method according to an embodiment, determining whether the eye is stationary may further include, when the eye is stationary, adopting the previously determined 3D eye position information as the 3D position information of the current eye.
In the method according to an embodiment, determining whether the eye is stationary may include calculating a normalized correlation coefficient between the current image frame of the eye region and the previous image frame of the eye region, and determining that the eye is stationary when the normalized correlation coefficient exceeds a threshold.
The method according to an embodiment may further include converting the 3D position information into the 3D coordinate system of a display, and adjusting or re-rendering the 3D image of the display based on the conversion result.
In the method according to an embodiment, identifying the 2D features in the eye region may include determining a supervised descent method (SDM) model based on local binary patterns (LBP), and identifying the 2D features in the eye region using the SDM model.
In the method according to an embodiment, determining the SDM model may include detecting an eye region in a captured sample image to determine a sample region, and iteratively training the SDM model using sample features measured in the sample region.
In the method according to an embodiment, iteratively training the SDM model may include, in an initial iteration stage, extracting coarse features from the sample region and training the SDM model and, in a subsequent iteration stage, extracting fine features from the sample region and training the SDM model. The coarse features may include at least one of histogram of oriented gradients (HOG) features, multi-block local binary pattern (MB-LBP) features, speeded-up robust features (SURF), and oriented FAST and rotated BRIEF (ORB) features. The fine features may include at least one of LBP features, Gabor wavelet features, discrete cosine transform (DCT) features, and binary robust independent elementary features (BRIEF).
In the method according to an embodiment, each iteration may include: extracting features from the sample region in different scale spaces and training the SDM model obtained in the previous iteration; comparing the output of the SDM model trained in each scale space with the pre-measured features of the sample region; and, according to the comparison result, selecting one of the SDM models for use in the next iteration.
In the method according to an embodiment, identifying the eye region in the captured face image may include determining the position of the eye and generating a virtual eye bounding box based on the determined position, and obtaining the eye region from the face image of the current frame based on the virtual eye bounding box. The determined position may be related to the position information of the 2D features.
An apparatus for determining eye position information according to an embodiment may include: an identification unit that identifies an eye region in a face image; a feature verification unit that identifies 2D features in the eye region; and a 3D position information determination unit that determines a 3D target model based on the 2D features and determines 3D position information based on the 3D target model.
In the apparatus according to an embodiment, the 3D position information determination unit may include: a model construction unit that obtains parameters from the 2D features and constructs the 3D target model based on the parameters; a matrix calculation unit that obtains a matrix using the 3D target model and the 2D features; and a position information determination unit that determines the 3D position information based on the 3D target model and the matrix.
The apparatus according to an embodiment may further include a stationary-state determination unit that compares preceding and subsequent frames of the identified eye region to determine whether the eye is stationary, and the feature verification unit may identify the 2D features in the eye region when the eye is not stationary.
In the apparatus according to an embodiment, when the eye is stationary, the feature verification unit may adopt the previously determined 3D eye position information as the current 3D eye position information.
In the apparatus according to an embodiment, the stationary-state determination unit may calculate a normalized correlation coefficient between the current image frame of the eye region and the previous image frame of the eye region, and determine that the eye is stationary when the normalized correlation coefficient exceeds a threshold.
The apparatus according to an embodiment may further include a coordinate system conversion unit that converts the 3D position information determined by the 3D position information determination unit into the 3D coordinate system of a display, and an image adjustment unit that adjusts or re-renders the 3D image of the display based on the 3D position information.
In the apparatus according to an embodiment, the feature verification unit may identify the 2D features in the eye region using an SDM model.
The apparatus according to an embodiment may further include an SDM model training unit that detects an eye region in a captured sample image to determine a sample region and iteratively trains the SDM model using sample features measured in the sample region.

FIG. 1 is a flowchart illustrating a method for determining 3D position information according to an embodiment.
FIG. 2 illustrates 2D features obtained using an SDM model according to an embodiment.
FIG. 3 illustrates the normalized correlation coefficients of 50 image frames of an eye region according to an embodiment.
FIG. 4 illustrates a 3D target model according to an embodiment.
FIG. 5 illustrates the flow of a method for training an SDM model according to an embodiment.
FIG. 6 illustrates the structure of an apparatus for determining 3D position information according to an embodiment.
FIG. 7 illustrates the structure of a 3D position information determination module according to an embodiment.

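Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only, and the embodiments may be modified and implemented in various forms. The embodiments are therefore not limited to the specific forms disclosed, and the scope of this specification includes changes, equivalents, and substitutes that fall within its technical spirit.
Although terms such as "first" and "second" may be used to describe various components, such terms should be interpreted only as distinguishing one component from another. For example, a first component may be termed a second component and, similarly, a second component may be termed a first component.
When a component is referred to as being "connected" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may be present.
Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to indicate that the described features, numbers, steps, operations, components, parts, or combinations thereof exist, and should be understood as not precluding the existence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.
Unless defined otherwise, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless expressly defined herein, are not to be interpreted in an idealized or overly formal sense.
The embodiments described below may be used to recognize a user's fingerprint. Hereinafter, recognizing a user's fingerprint may include authenticating or identifying the user. Authenticating a user may include, for example, determining whether the user is a pre-registered user. In this case, the result of authentication may be output as true or false. Identifying a user may include, for example, determining which of a plurality of pre-registered users the user corresponds to. In this case, the result of identification may be output as the ID of one of the pre-registered users. If the user does not correspond to any of the pre-registered users, a signal indicating that the user has not been identified may be output.
The embodiments may be implemented in various types of products, such as personal computers, laptop computers, tablet computers, smartphones, televisions, smart home appliances, intelligent vehicles, kiosks, and wearable devices. For example, the embodiments may be applied to authenticating users in smartphones, mobile devices, smart home systems, and the like. The embodiments may be applied to payment services based on user authentication. The embodiments may also be applied to intelligent vehicle systems that authenticate a user and start automatically. Hereinafter, the embodiments are described in detail with reference to the accompanying drawings. Like reference numerals in the drawings denote like elements.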

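FIG. 1 is a flowchart illustrating a method for determining 3D position information according to an embodiment.
The method for determining 3D position information according to an embodiment may include identifying an eye region in a face image (S101).
In this step, an eye is detected in the face image, and the detected portion is extracted from the face image as the eye region.
To improve the efficiency and accuracy of eye region identification, a virtual eye bounding box may be generated based on the previous eye region identification result, and the eye region may be identified and obtained from the face image of the current frame based on the generated virtual eye bounding box.
Specifically, the previous eye region identification result may be related to the position information of the 2D features determined from the previous face image. Because the position information of the 2D features has relatively high precision and accuracy, an eye region image obtained with a virtual eye bounding box generated from it can improve the efficiency and accuracy of eye region identification.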
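The bounding-box step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `margin` padding factor, and the list-of-rows image layout are all assumptions introduced here.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive


def virtual_eye_box(prev_landmarks: List[Point], margin: float,
                    image_w: int, image_h: int) -> Box:
    """Box around the previous frame's 2D eye features, padded by `margin`.

    `margin` is a hypothetical padding factor (fraction of the box size);
    the source does not specify how the box is enlarged.
    """
    xs = [p[0] for p in prev_landmarks]
    ys = [p[1] for p in prev_landmarks]
    w = max(xs) - min(xs)
    h = max(ys) - min(ys)
    # Clamp the padded box to the image borders.
    left = int(max(0, min(xs) - margin * w))
    top = int(max(0, min(ys) - margin * h))
    right = int(min(image_w, max(xs) + margin * w + 1))
    bottom = int(min(image_h, max(ys) + margin * h + 1))
    return left, top, right, bottom


def crop(image: List[List[int]], box: Box) -> List[List[int]]:
    """Cut the eye region out of a grayscale image stored as rows of pixels."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]
```

Seeding the search with the previous frame's features keeps the crop small, which is one plausible reading of why the text expects both speed and accuracy to improve.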
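The method for determining 3D position information according to an embodiment may include determining 2D features in the eye region (S102).
In this step, the mobile device may identify the 2D features in the eye region using a pre-trained Hessian matrix and residual term. In the description below, the Hessian matrix and the residual term are collectively referred to as the supervised descent method (SDM) model.
FIG. 2 illustrates 2D features obtained using an SDM model according to an embodiment.
Identifying the 2D features with the pre-trained SDM model may proceed by extracting fine features, such as local binary pattern (LBP) features, from the eye region and then iterating the SDM model on the extracted fine features to obtain the 2D features.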
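The iterative calculation can be pictured with the usual supervised descent update, x_{k+1} = x_k + R_k φ(x_k) + b_k. The sketch below assumes that form; the source only names a trained matrix and a residual term, so the stage layout `(R, b)` and the `extract` callback are illustrative assumptions.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]
Stage = Tuple[Matrix, Vector]  # assumed layout: (trained matrix R_k, residual term b_k)


def sdm_refine(x0: Sequence[float],
               extract: Callable[[Sequence[float]], Vector],
               stages: Sequence[Stage]) -> Vector:
    """Run the cascade x_{k+1} = x_k + R_k * phi(x_k) + b_k.

    `x0` is the flattened initial landmark vector (x1, y1, x2, y2, ...).
    `extract` stands in for the feature extractor phi, for example LBP
    features sampled around each current landmark estimate.
    """
    x = list(x0)
    for R, b in stages:
        phi = extract(x)
        # One learned descent step: matrix-vector product plus residual term.
        x = [xi + sum(r * f for r, f in zip(row, phi)) + bi
             for xi, row, bi in zip(x, R, b)]
    return x
```

Each stage moves the landmark estimate using only features computed at the current estimate, so no image gradients are needed at run time.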
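In an embodiment, to improve the accuracy of the 2D features computed with the SDM model, the eye region may be cropped from the current face frame image according to the virtual eye bounding box generated in step S101. Before the 2D features are determined, the cropped eye region is scaled to a specific size, and the scaled image is used as the identified eye region image.
The user's eyes may not be looking straight at the camera, for example when the head is tilted to one side. In that case, after the eye region is identified in step S101, the obtained eye region image may be rotated by a certain angle so that the eyes face the camera squarely. Likewise, after the 2D features are computed in the rotated eye region using the SDM model, the inverse rotation may be applied to determine the 2D features in the eye region obtained in step S101.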
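A small sketch of the rotate-then-rotate-back idea, applied to landmark coordinates. Estimating the tilt from the line joining the two eye centres is an assumption made here for illustration; the source only says the image is rotated by "a certain angle".

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def roll_angle(left_eye: Point, right_eye: Point) -> float:
    """In-plane tilt, in degrees, of the line joining the two eye centres."""
    return math.degrees(math.atan2(right_eye[1] - left_eye[1],
                                   right_eye[0] - left_eye[0]))


def rotate_points(points: List[Point], angle_deg: float, center: Point) -> List[Point]:
    """Rotate 2D points about `center` by `angle_deg` degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    cx, cy = center
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in points]
```

Rotating by `-roll_angle(...)` levels the eyes before landmark detection; rotating the detected landmarks by `+roll_angle(...)` about the same centre maps them back into the original eye region.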
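In an embodiment, identifying the 2D features in the eye region may include comparing the preceding and subsequent frames of the eye region to determine whether the eye is stationary. When the eye is determined not to be stationary, the 2D features are identified in the eye region. When the eye is determined to be stationary, the previously determined 3D position information may be adopted as the current 3D position information.
To determine whether the eye is stationary, the current eye region frame is compared with the previous eye region frame to check for inter-frame motion information; if there is none, the eye may be determined to be stationary. Inter-frame motion information may be determined from the pixel changes between the images.
According to an embodiment, the preceding and subsequent frames may be compared by calculating the normalized correlation coefficient between the current and previous image frames of the eye region. If the calculated coefficient is greater than a threshold, the eye is determined to be stationary; otherwise, it is determined not to be stationary.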
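Specifically, the normalized correlation coefficient (NCC) between the eye region image of the current frame and that of the previous frame may be calculated according to Equation (1):

$$\mathrm{NCC} = \frac{\sum_{x,y}\bigl(C(x,y)-\bar{C}\bigr)\bigl(P(x,y)-\bar{P}\bigr)}{\sqrt{\sum_{x,y}\bigl(C(x,y)-\bar{C}\bigr)^{2}\,\sum_{x,y}\bigl(P(x,y)-\bar{P}\bigr)^{2}}} \qquad (1)$$

where $C(x, y)$ and $P(x, y)$ are the gray values of the pixel at coordinates $(x, y)$ in the current and previous image frames of the eye region, respectively, and $\bar{C}$ and $\bar{P}$ are the mean gray values of the current and previous image frames of the eye region, respectively.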
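FIG. 3 illustrates the normalized correlation coefficients of 50 image frames of an eye region according to an embodiment.
As FIG. 3 shows, while the eye is stationary, the normalized correlation coefficient between consecutive frames stays very high, at 0.995 or above. When the eye blinks, the coefficient drops markedly; in FIG. 3 it falls to 0.78 at frame 32. Accordingly, setting the threshold to 0.99 makes it possible to distinguish a stationary eye from a moving one.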
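The stationary test can be sketched with the standard zero-mean normalized correlation coefficient and the 0.99 threshold quoted above. The function names and the handling of a perfectly flat image are assumptions of this sketch.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale eye region, rows of pixel values


def ncc(current: Image, previous: Image) -> float:
    """Zero-mean normalized correlation coefficient of two same-size frames."""
    c = [v for row in current for v in row]
    p = [v for row in previous for v in row]
    if len(c) != len(p) or not c:
        raise ValueError("frames must be non-empty and the same size")
    c_mean = sum(c) / len(c)
    p_mean = sum(p) / len(p)
    num = sum((a - c_mean) * (b - p_mean) for a, b in zip(c, p))
    den = math.sqrt(sum((a - c_mean) ** 2 for a in c) *
                    sum((b - p_mean) ** 2 for b in p))
    # Assumption: a perfectly flat frame carries no evidence, so report 0.
    return num / den if den else 0.0


def is_stationary(current: Image, previous: Image, threshold: float = 0.99) -> bool:
    """True when the coefficient exceeds the threshold (0.99 in the text)."""
    return ncc(current, previous) > threshold
```

Because the coefficient subtracts the mean and normalizes by the variance, a uniform brightness change between frames does not by itself register as motion.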
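To improve the accuracy of the 3D position information, the reliability of the 2D features may be evaluated. The evaluation may be based on the LBP features extracted while determining the 2D features with the SDM model, and on the position information of the 2D features determined from those LBP features.
LBP features are extracted in the course of determining the 2D features with the SDM model, and the 2D features are identified by iterating the pre-trained SDM model on the extracted LBP features.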
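For readers unfamiliar with LBP, the basic 8-neighbour operator can be sketched as below. The neighbour ordering and the use of a whole-region histogram are conventions chosen for this illustration; the source does not state which LBP variant or sampling layout is used.

```python
from typing import List

Image = List[List[int]]  # grayscale image, rows of pixel values

# Neighbour offsets (dx, dy), clockwise from the top-left pixel.
_NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (1, 0),
               (1, 1), (0, 1), (-1, 1), (-1, 0)]


def lbp_code(img: Image, x: int, y: int) -> int:
    """8-bit LBP code of the interior pixel at column x, row y."""
    center = img[y][x]
    code = 0
    for bit, (dx, dy) in enumerate(_NEIGHBOURS):
        # Set the bit when the neighbour is at least as bright as the centre.
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img: Image) -> List[int]:
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, x, y)] += 1
    return hist
```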
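According to an embodiment, a pre-trained support vector machine (SVM) classifier may classify the LBP features to check whether they reflect the LBP features of an actual eye region. If they are judged to reflect actual LBP features, the evaluation result is that the 2D features match; otherwise, the evaluation result is that the 2D features do not match. When the evaluation result is a match, the 3D target model is determined from the 2D features in step S103. When it is not a match, the process returns to steps S101 and S102 and repeats the eye region identification and the 2D feature identification.
In an embodiment, the SVM classifier may be trained with positive and negative sample features. Eye regions that accurately reflect the actual shape of the eye are collected as positive samples, and eye regions that do not accurately reflect the actual shape of the eye are collected as negative samples. The LBP features extracted from the positive-sample eye regions serve as positive sample features, and the LBP features extracted from the negative-sample eye regions serve as negative sample features.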
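The reliability gate can be sketched as a linear decision function followed by the branch described above. The weights and bias are assumed to come from an SVM trained offline on positive and negative LBP features; a linear kernel and the step labels returned by `next_step` are assumptions of this sketch, not details given in the source.

```python
from typing import Sequence


def svm_score(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Signed distance-like score w . x + b of a linear SVM (assumed kernel)."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def landmarks_reliable(features: Sequence[float],
                       weights: Sequence[float], bias: float) -> bool:
    """True when the LBP features fall on the positive (real eye) side."""
    return svm_score(features, weights, bias) >= 0.0


def next_step(reliable: bool) -> str:
    """Continue to model fitting (S103) or go back to eye detection (S101)."""
    return "S103" if reliable else "S101"
```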
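FIG. 4 illustrates a 3D target model according to an embodiment.
The method for determining eye position information according to an embodiment may include determining a 3D target model based on the 2D features (S103).
In this step, parameters associated with the 2D features determined in step S102 may be obtained. Specifically, the eye parameters $\sigma$ and $\alpha$ are obtained from the following model equation:

$$g(\sigma,\alpha) = \bar{g} + S\sigma + A\alpha$$

where $g(\sigma,\alpha)$ is a generic 3D eye model, which may be obtained from a generic 3D face model; $\bar{g}$ is the generic 3D mean eye shape; $S$ is the shape vector, a vector describing the face shape; $A$ is the expression vector, a vector describing the expression; and $\sigma$ and $\alpha$ are the eye parameters corresponding to the shape vector $S$ and the expression vector $A$, respectively. In practice, the shape vector $S$ is obtained from a previously captured face image, and the expression vector $A$ is obtained from previously captured face images of multiple frames.
The 3D target model $g'(\sigma,\alpha)$ is then constructed from the eye parameters $\sigma$ and $\alpha$, the generic 3D mean eye shape $\bar{g}$, and the shape vector $S$ and expression vector $A$ obtained by fitting in advance, as shown in FIG. 4.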
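As a rough sketch, and assuming the model is a linear combination of the mean eye shape, the shape vector weighted by its parameter, and the expression vector weighted by its parameter, the target model can be assembled as below. Allowing several basis vectors per term, and all names used here, are assumptions for illustration.

```python
from typing import List, Sequence

Vector = List[float]  # flattened 3D points (x1, y1, z1, x2, y2, z2, ...)


def build_target_model(mean_shape: Sequence[float],
                       shape_basis: Sequence[Sequence[float]],
                       expr_basis: Sequence[Sequence[float]],
                       sigma: Sequence[float],
                       alpha: Sequence[float]) -> Vector:
    """Mean shape + sum(sigma_i * shape_basis_i) + sum(alpha_j * expr_basis_j)."""
    model = list(mean_shape)
    # Shape term: identity-dependent deformation of the mean eye shape.
    for coeff, basis in zip(sigma, shape_basis):
        model = [m + coeff * b for m, b in zip(model, basis)]
    # Expression term: deformation caused by the facial expression.
    for coeff, basis in zip(alpha, expr_basis):
        model = [m + coeff * b for m, b in zip(model, basis)]
    return model
```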
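The method for determining eye position information according to an embodiment may further include determining 3D position information based on the 3D target model (S104).
Specifically, a rigid transformation matrix is obtained using the 3D target model and the 2D features determined in step S102, and the 3D position information is determined from the rigid transformation matrix and the 3D target model fitted in step S103.
In this step, the rigid transformation matrix may be obtained by minimizing the following objective function:

$$\min_{Q}\ \bigl\lVert s' - P\bigl(Q\,g'(\sigma,\alpha)\bigr)\bigr\rVert^{2}$$

where $P$ is the perspective projection matrix, $Q$ is the rigid transformation matrix, $g'(\sigma,\alpha)$ is the 3D target model, and $s'$ denotes the 2D features.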
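The objective being minimized is a reprojection error: the squared distance between the observed 2D features and the projection of the rigidly transformed 3D model. The sketch below evaluates that error for a pinhole camera and, as a deliberately simplified stand-in for a real nonlinear solver, searches only over candidate depths with the rotation fixed to identity. The pinhole parameters `f`, `cx`, `cy` and all function names are assumptions.

```python
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]
Rigid = Tuple[List[List[float]], Point3]  # (3x3 rotation rows, translation)


def apply_rigid(Q: Rigid, point: Point3) -> Point3:
    """Rotate then translate a 3D point."""
    R, t = Q
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))


def project(point: Point3, f: float, cx: float, cy: float) -> Point2:
    """Pinhole perspective projection with focal length f and principal point (cx, cy)."""
    x, y, z = point
    return (f * x / z + cx, f * y / z + cy)


def reprojection_error(Q: Rigid, model_points: Sequence[Point3],
                       features: Sequence[Point2],
                       f: float, cx: float, cy: float) -> float:
    """Sum of squared distances between observed and projected features."""
    err = 0.0
    for m, (u, v) in zip(model_points, features):
        pu, pv = project(apply_rigid(Q, m), f, cx, cy)
        err += (u - pu) ** 2 + (v - pv) ** 2
    return err


def fit_depth(model_points: Sequence[Point3], features: Sequence[Point2],
              f: float, cx: float, cy: float, candidates: Sequence[float]) -> float:
    """Toy minimizer: pick the depth z with the lowest error, rotation = identity."""
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    return min(candidates,
               key=lambda z: reprojection_error((identity, (0.0, 0.0, z)),
                                                model_points, features, f, cx, cy))
```

A practical system would optimize all six rigid parameters, for example with an iterative least-squares or perspective-n-point solver, rather than a one-dimensional search.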
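After the 3D position information is determined in step S104, 3D display may be performed through steps S105 and S106 below.
The method for determining eye position information according to an embodiment may further include converting the 3D position information into the 3D coordinate system of the display of the mobile device (S105).
The 3D position information determined in step S104 is expressed in the 3D coordinate system of the camera that captures the eye, whereas the 3D image viewed by the user is expressed in the 3D coordinate system of the display screen. Therefore, the 3D position information may be converted into the 3D coordinate system of the mobile device's display screen using the rigid transformation matrix from the camera's 3D coordinate system to the display screen's 3D coordinate system.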
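The coordinate conversion amounts to applying one fixed rigid transform to the eye position. A minimal sketch with a 4x4 homogeneous matrix follows; the matrix layout and the example calibration values used in the note below are hypothetical.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]
Matrix4 = List[List[float]]  # 4x4 homogeneous camera-to-display transform


def to_display_coords(eye_cam: Point3, T: Matrix4) -> Point3:
    """Map an eye position from camera coordinates to display coordinates."""
    x, y, z = eye_cam
    h = (x, y, z, 1.0)  # homogeneous coordinates
    return tuple(sum(T[i][j] * h[j] for j in range(4)) for i in range(3))
```

For instance, with a hypothetical calibration in which the camera sits 120 mm above the display origin and its y and z axes point the opposite way, `T = [[1, 0, 0, 0], [0, -1, 0, 120], [0, 0, -1, 0], [0, 0, 0, 1]]` would express an eye at camera position (10, 20, 500) in the display's frame.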
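The method for determining eye position information according to an embodiment may include adjusting or re-rendering the displayed 3D image content based on the conversion result (S106).
In practice, the 3D position of the eye relative to the display screen of the mobile device may be determined from the converted 3D position information in combination with the camera's intrinsic parameters, such as the principal point position and the focal length. The displayed content is then adjusted or re-rendered based on this position so that the user sees a correct 3D image from the current viewing position.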
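FIG. 5 illustrates the flow of a method for training an SDM model according to an embodiment.
The SDM model used in step S102 may be trained in advance. The training method is shown in FIG. 5 and includes the following steps.
Determining the SDM model according to an embodiment may include detecting an eye region in a captured sample image to obtain a sample region (S501).
In this step, sample face images of users may be collected in advance with an image acquisition device (for example, a camera). During collection, the camera photographs the user's face in various poses to obtain the sample face images. The eye region is then identified in each sample image to obtain a sample region. After the sample regions are collected, sample features are obtained for each sample region.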
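Determining the SDM model according to an embodiment may include iteratively training the SDM model using the sample features measured in the sample regions (S502). Specifically, the training process may be divided, in order of iteration, into two stages: an initial iteration stage and a subsequent iteration stage. To improve the precision of the SDM model, coarse features are extracted from the sample regions to train the model in the initial iteration stage, and fine features are extracted from the sample regions to train the model in the subsequent iteration stage. The coarse features may include at least one of histogram of oriented gradients (HOG) features, multi-block local binary pattern (MB-LBP) features, speeded-up robust features (SURF), and oriented FAST and rotated BRIEF (ORB) features.
The fine features include at least one of LBP features, Gabor features, discrete cosine transform (DCT) features, and binary robust independent elementary features (BRIEF).
In one iteration, the sample region may be scaled according to preset scaling ratios to obtain sample regions in different scale spaces. Features are then extracted from the sample regions in the different scale spaces, and the SDM model obtained in the previous iteration is trained on them. The extracted features may be coarse features or fine features. The SDM model is trained in each scale space, and the 2D features obtained from the sample region in that scale space are taken as the output of the SDM model trained in that scale space.
The outputs of the SDM models trained in the respective scale spaces may each be compared with the pre-measured sample features, and the SDM model whose output has the highest similarity may be used in the next iteration. In practice, the scale space of the sample region used to train the SDM model with the most similar output may be taken as the optimal scale space for that model, and the sample region in the optimal scale space is used as the sample region for the next iteration.
As described above, the SDM model is trained first with one kind of feature and then with the other, the optimal scale space is selected in every training iteration, and the SDM model trained in the selected scale space serves as the basis for the next iteration, which can improve the accuracy of the 2D features.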
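The coarse-to-fine schedule with per-iteration scale selection can be sketched as a small driver loop. The actual regression training is hidden behind hypothetical callbacks (`coarse_step`, `fine_step`), and "highest similarity" is approximated here by the lowest mean squared error against the pre-measured landmarks; both are assumptions of this sketch.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical callback: (model, scale) -> (new_model, predicted_landmarks)
TrainStep = Callable[[Any, float], Tuple[Any, List[float]]]


def mean_sq_error(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Lower error is treated as higher similarity (an assumption)."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)


def train_one_iteration(model: Any, scales: Sequence[float],
                        step: TrainStep, truth: Sequence[float]) -> Tuple[Any, float]:
    """Train at every scale and keep the model whose output best matches `truth`."""
    if not scales:
        raise ValueError("at least one scale is required")
    best = None
    for scale in scales:
        candidate, predicted = step(model, scale)
        err = mean_sq_error(predicted, truth)
        if best is None or err < best[0]:
            best = (err, candidate, scale)
    return best[1], best[2]


def train_cascade(initial_model: Any, scales: Sequence[float],
                  coarse_step: TrainStep, fine_step: TrainStep,
                  truth: Sequence[float], n_coarse: int, n_fine: int
                  ) -> Tuple[Any, List[float]]:
    """Coarse features in the first n_coarse iterations, fine features afterwards."""
    model = initial_model
    chosen: List[float] = []
    for i in range(n_coarse + n_fine):
        step = coarse_step if i < n_coarse else fine_step
        model, scale = train_one_iteration(model, scales, step, truth)
        chosen.append(scale)
    return model, chosen
```

Keeping only the best-scale model per iteration means later, fine-feature iterations start from the most accurate coarse estimate available.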

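FIG. 6 illustrates the structure of an apparatus for determining 3D position information according to an embodiment.
The apparatus for determining 3D position information according to an embodiment may include an identification unit (601), a feature verification unit (602), and a 3D position information determination unit (603).
The identification unit (601) may identify an eye region in a face image. The feature verification unit (602) may identify 2D features in the eye region identified by the identification unit (601). Specifically, the feature verification unit (602) may identify the 2D features in the eye region using a pre-stored SDM model.
The 3D position information determination unit (603) may determine a 3D target model based on the 2D features determined by the feature verification unit (602) and then determine 3D position information based on the determined 3D target model.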
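When a user watches video, the head stays in a relatively fixed position most of the time, and during that time the position of the eyes relative to the display screen of the mobile device does not change. Therefore, to improve the efficiency of determining the 3D position information, the apparatus may preferably further include a stationary-state determination unit (604).
The stationary-state determination unit (604) determines whether the eye is stationary based on the similarity between the preceding and subsequent frames of the eye region identified by the identification unit, or on inter-frame motion information, and outputs the determination result.
For example, the stationary-state determination unit (604) calculates the normalized correlation coefficient between the eye region image of the current frame and that of the previous frame, and determines that the eye is stationary when the coefficient exceeds a set threshold.
Correspondingly, when the determination result output by the stationary-state determination unit (604) indicates that the eye is not stationary, the feature verification unit (602) determines the 2D features in the eye region. When the result indicates that the eye is stationary, the previously determined 3D position information is used as the 3D position information for the currently captured face frame.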
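The interplay between the stationary check and the feature pipeline can be sketched as a small cache. The class name and the two injected callables are hypothetical; `estimate_3d` stands for the full chain of 2D feature detection, model fitting, and position estimation, and `is_stationary` for the frame comparison.

```python
from typing import Any, Callable, Optional, Tuple

Position = Tuple[float, float, float]


class EyePositionCache:
    """Reuse the previous 3D eye position while the eye region is unchanged."""

    def __init__(self,
                 estimate_3d: Callable[[Any], Position],
                 is_stationary: Callable[[Any, Any], bool]) -> None:
        self._estimate_3d = estimate_3d      # expensive: features + model fitting
        self._is_stationary = is_stationary  # cheap: e.g. a correlation threshold
        self._prev_region: Optional[Any] = None
        self._prev_position: Optional[Position] = None

    def update(self, eye_region: Any) -> Position:
        """Return the 3D position for the current frame's eye region."""
        if (self._prev_region is not None
                and self._prev_position is not None
                and self._is_stationary(eye_region, self._prev_region)):
            position = self._prev_position   # stationary: skip the pipeline
        else:
            position = self._estimate_3d(eye_region)
        self._prev_region = eye_region
        self._prev_position = position
        return position
```

Because the expensive path runs only when the cheap comparison fails, most frames of a viewer sitting still cost a single frame comparison.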
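Furthermore, the 3D position information determined by the 3D position information determination unit (603) is expressed in the 3D coordinate system of the mobile device's camera. Therefore, to obtain the 3D position of the eye relative to the display of the mobile device, the apparatus may further include a coordinate system conversion unit (605) and a display content adjustment unit (606).
The coordinate system conversion unit (605) converts the 3D position information determined by the 3D position information determination unit (603) into the 3D coordinate system of the display screen, and the display content adjustment unit (606) may adjust or re-render the 3D image content shown on the display screen based on the 3D position information converted by the coordinate system conversion unit (605). To improve the accuracy of the finally determined 3D position information, the apparatus may further include a reliability evaluation unit (not shown).
The reliability evaluation unit obtains the LBP features that the feature verification unit (602) extracted while determining the 2D features with the SDM model, and uses a classifier to evaluate, from the obtained LBP features, the reliability of the position information of the 2D features determined by the feature verification unit (602).
Specifically, the reliability evaluation unit may classify the obtained LBP features using a pre-stored SVM classifier and check whether they accurately reflect the LBP features of an actual eye region. If they are judged to reflect actual LBP features, the evaluation result is that the 2D features match; otherwise, the evaluation result is that the 2D features do not match. Likewise, the 3D position information determination unit (603) may determine the 3D target model from the 2D features when the evaluation result is a match. Furthermore, when the evaluation indicates low reliability, the reliability evaluation unit may cause the identification unit (601) to identify the eye region again.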
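According to another embodiment, the apparatus for determining 3D position information may further include an SVM classifier training unit.
The SVM classifier training unit takes the LBP features extracted from eye regions collected as positive samples as positive sample features, takes the LBP features extracted from eye regions collected as negative samples as negative sample features, and trains the SVM classifier using the positive and negative sample features.
In an embodiment, the SDM model used by the feature verification unit (602) is stored in advance; it may be trained by another device or trained in advance by the apparatus for determining 3D position information itself.
Therefore, the apparatus may preferably further include an SDM model training unit (not shown).
The SDM model training unit may take the eye regions detected in captured sample face images as sample regions and iteratively train the SDM model using the sample features measured in the sample regions. Specifically, the SDM model training unit may include a sample collection unit, an initial iteration unit, and a subsequent iteration unit. The sample collection unit takes the eye regions identified in the sample face images as the sample regions. In the initial iteration stage, the initial iteration unit may extract coarse features from the sample regions obtained by the sample collection unit and train the SDM model with the extracted coarse features. In the subsequent iteration stage, the subsequent iteration unit may extract fine features from the sample regions obtained by the sample collection unit and train the SDM model with the extracted fine features. The coarse features may include at least one of histogram of oriented gradients (HOG) features, multi-block local binary pattern (MB-LBP) features, speeded-up robust features (SURF), and ORB features. The fine features include at least one of LBP features, Gabor features, discrete cosine transform (DCT) features, and binary robust independent elementary features (BRIEF).
In one iteration, the initial iteration unit or the subsequent iteration unit may scale the sample region according to preset scaling ratios to obtain sample regions in different scale spaces. Features are then extracted from the sample regions in the different scale spaces, and the SDM model obtained in the previous iteration is trained on them. The SDM model is trained in each scale space, and the 2D features obtained from the sample region in that scale space are taken as the output of the SDM model trained in that scale space. Finally, the outputs of the SDM models trained in the respective scale spaces are each compared for similarity with the pre-measured sample key points, and the SDM model whose output has the highest similarity is used in the next iteration.
In this way, the SDM model is trained first with one kind of feature and then with the other; in every training iteration the optimal scale space is selected, and the SDM model trained in that scale space serves as the basis for the next iteration, which can improve the accuracy of the 2D features computed by the resulting SDM model.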
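In an embodiment, to improve the efficiency and accuracy of eye region identification, the identification unit (601) may identify the eye region based on the previous eye positioning result.
Specifically, the identification unit (601) may include an eye bounding box calculation unit and an eye region acquisition unit. The eye bounding box calculation unit may generate a virtual eye bounding box based on the previous eye positioning result, which may be related to the position information of the 2D features determined from the previous face image.
The eye region acquisition unit crops the image from the current face frame according to the virtual eye bounding box generated by the eye bounding box calculation unit to obtain the eye region.
In practice, the eye region acquisition unit may further scale the cropped image to a specific size and use the result as the identified eye region image.
In an embodiment, as shown in FIG. 7, the 3D position information determination unit (603) may include a model construction unit (701), a matrix calculation unit (702), and a position information determination unit (703).
The model construction unit (701) may obtain parameters associated with the 2D features identified by the feature verification unit (602) and construct the 3D target model based on the parameters. The matrix calculation unit (702) may obtain a rigid transformation matrix using the 3D target model constructed by the model construction unit (701) and the 2D features identified by the feature verification unit (602). The position information determination unit (703) may determine the 3D position information based on the 3D target model constructed by the model construction unit (701) and the rigid transformation matrix obtained by the matrix calculation unit (702). The specific functions of the modules of the apparatus and of their units correspond to the steps of the method for determining 3D position information described above, so a detailed description is omitted.
According to an embodiment, after 2D features are identified in the eye region identified in a face image, a 3D target model is determined based on the 2D features, and 3D position information is determined based on the 3D target model. Furthermore, the 3D position information, determined with higher precision, may be used to adjust or re-render the 3D image shown on the display screen.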

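The embodiments described above may be implemented with hardware components, software components, and/or a combination of hardware and software components. For example, the apparatuses, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications that run on the OS. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, a single processing device is sometimes described, but a person of ordinary skill in the art will appreciate that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, a processing device may include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.
Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or instruct the processing device independently or collectively. Software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to it. Software may also be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.
The method according to an embodiment may be implemented as program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine code, such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.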
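As described above, although the embodiments have been described with reference to limited drawings, those skilled in the art can apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.
Specific structural or functional descriptions of the embodiments are disclosed for illustrative purposes only, and may be modified and implemented in various forms. Therefore, the embodiments are not limited to the specific disclosed form, and the scope of the present specification includes changes, equivalents, or substitutes included in the technical spirit.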
Although terms such as first or second may be used to describe various components, such terms should only be construed for the purpose of distinguishing one component from another. For example, a first element may be termed a second element, and similarly, a second element may be termed a first element.
It should be understood that when an element is referred to as being "connected" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present.
Singular expressions include plural expressions unless the context clearly dictates otherwise. In this specification, terms such as "comprise" or "have" are intended to designate that the described feature, number, step, operation, component, part, or combination thereof exists, and should not be understood to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.
Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art. Terms such as those defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless explicitly defined in this specification, should not be interpreted in an idealized or excessively formal sense.
Embodiments described below may be used to recognize a user's fingerprint. Hereinafter, an operation of recognizing a user's fingerprint may include an operation of authenticating or identifying the user. An operation of authenticating a user may include, for example, an operation of determining whether the user is a pre-registered user. In this case, the result of the user authentication operation may be output as true or false. The operation of identifying the user may include, for example, an operation of determining which user the user corresponds to among a plurality of pre-registered users. In this case, a result of the operation for identifying the user may be output as an ID of any one pre-registered user. If the user does not correspond to any of the plurality of pre-registered users, a signal indicating that the user is not identified may be output.
The embodiments may be implemented in various types of products such as personal computers, laptop computers, tablet computers, smart phones, televisions, smart home appliances, intelligent vehicles, kiosks, and wearable devices. For example, the embodiments may be applied to authenticating a user in a smart phone, mobile device, smart home system, and the like. The embodiments may be applied to a payment service through user authentication. In addition, the embodiments may be applied to an intelligent vehicle system that automatically starts the engine by authenticating a user. Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. Like reference numerals in each figure indicate like elements.

FIG. 1 is a flowchart illustrating a method for determining 3D position information according to an embodiment.
A method for determining 3D position information according to an embodiment may include identifying an eyeball region in a face image (S101).
In this step, the eyeball may be located in the face image, and the located part may be cropped from the face image and used as the eyeball region.
To improve the efficiency and accuracy of eyeball region identification, a virtual eye bounding box may be generated based on a previous eye positioning result, and the eyeball region may be obtained from the face image of the current frame based on the generated virtual eye bounding box.
In detail, the previous eye positioning result may be the position information of the 2D features determined from a previous face image. Because that position information has relatively high precision and accuracy, cropping the eyeball region with a virtual eye bounding box generated from it improves the efficiency and accuracy of eyeball region identification.
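One way to picture the virtual bounding box is as the tight box around the previous frame's 2D feature points, padded by a margin. The sketch below is only an illustration under that assumption: the function names `virtual_eye_box` and `crop`, the `margin` parameter, and the padding rule are hypothetical and are not specified in the source.

```python
def virtual_eye_box(prev_points, margin=0.25, image_size=None):
    """Return (left, top, right, bottom) around the previous 2D feature points.

    margin is a hypothetical padding factor, expressed as a fraction of the
    tight box width and height; the source does not state how the box is sized.
    """
    xs = [p[0] for p in prev_points]
    ys = [p[1] for p in prev_points]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    pad_x = (x1 - x0) * margin
    pad_y = (y1 - y0) * margin
    left, top, right, bottom = x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y
    if image_size is not None:
        # Clamp the box to the image bounds given as (width, height).
        width, height = image_size
        left, top = max(0.0, left), max(0.0, top)
        right, bottom = min(float(width), right), min(float(height), bottom)
    return (left, top, right, bottom)


def crop(image, box):
    """Crop a row-major image (list of rows) to the given box."""
    left, top, right, bottom = (int(round(v)) for v in box)
    return [row[left:right] for row in image[top:bottom]]
```

The cropped region of the current frame would then be passed on to the 2D feature step; how the margin is chosen in practice is left open here.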
A method for determining 3D position information according to an embodiment may include determining 2D features in the eyeball region (S102).
In this step, the mobile device may determine the 2D features in the eyeball region using a pre-trained Hessian matrix and residual terms. In the description below, the Hessian matrix and residual terms are collectively referred to as the supervised descent method (SDM) model, or supervised descent model.
FIG. 2 is a diagram illustrating 2D features acquired using the supervised descent model according to an embodiment.
To determine the 2D features using the pre-trained supervised descent model, precise features such as local binary patterns (LBP) are extracted from the eyeball region, and the 2D features are obtained by iterative calculation using the supervised descent model based on the extracted precise features.
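The iterative calculation can be sketched as a cascade of learned linear updates, which is the usual form of a supervised descent method: each stage adds a learned matrix times the current features plus a learned bias to the current point estimate. The code below is a minimal sketch under that assumption. The names `sdm_refine`, `stages`, and `extract_features` are hypothetical, and the trained matrices are taken as given rather than derived here.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Sequence[float]) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sdm_refine(x0: Sequence[float],
               stages: Sequence[Tuple[Matrix, Vector]],
               extract_features: Callable[[Sequence[float]], Vector]) -> Vector:
    """Refine flattened 2D feature coordinates with x <- x + R * phi(x) + b.

    stages holds one (R, b) pair per iteration; in a real system these would
    come from training. extract_features stands in for the feature extractor
    (for example LBP sampled around the current points).
    """
    x = list(x0)
    for R, b in stages:
        phi = extract_features(x)          # features at the current estimate
        step = matvec(R, phi)              # learned descent direction
        x = [xi + si + bi for xi, si, bi in zip(x, step, b)]
    return x
```

With a toy extractor that returns the offset to a known target and stages that halve that offset, two stages move the estimate three quarters of the way to the target, which shows the cascade structure without any trained data.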
In one embodiment, to improve the accuracy of the 2D features calculated using the supervised descent model, the eyeball region cropped from the current face frame image based on the virtual eye bounding box generated in step S101 may be scaled down or up to a specific size before the 2D features are determined, and the scaled image is used as the image of the identified eyeball region.
The user's eyes may not directly face the camera, for example when the head is tilted to one side. In this case, after the eyeball region is identified in step S101, the acquired image of the eyeball region may be rotated by a certain angle so that the eyes face the camera squarely. Then, after the 2D features of the rotated eyeball region are calculated using the supervised descent model, the 2D features of the eyeball region acquired in step S101 may be determined by applying the inverse rotation.
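The rotate-then-rotate-back idea can be illustrated on point coordinates. In this sketch the tilt angle is estimated from the line joining two eye positions, which is an assumption made for illustration; the source only says the image is rotated by "a certain angle". The function names are hypothetical.

```python
import math


def roll_angle(left_eye, right_eye):
    """Angle (radians) of the line from the left eye to the right eye."""
    return math.atan2(right_eye[1] - left_eye[1], right_eye[0] - left_eye[0])


def rotate_points(points, angle_rad, center):
    """Rotate 2D points by angle_rad about center."""
    cx, cy = center
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(cx + c * (x - cx) - s * (y - cy),
             cy + s * (x - cx) + c * (y - cy)) for x, y in points]
```

Rotating by the negative of the tilt angle levels the eyes; after the 2D features are found in the levelled region, rotating those points by the positive angle about the same center maps them back to the original region.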
In one embodiment, before the 2D features are determined in the eyeball region, whether the eyeball is stationary may be determined by comparing consecutive frames of the eyeball region. When the eyeball is determined not to be stationary, the 2D features are determined in the eyeball region. When the eyeball is determined to be stationary, the previously determined 3D position information may be used as the current 3D position information.
To determine whether the eyeball is stationary, the current frame of the eyeball region is compared with the previous frame to determine whether there is motion information between the frames; if there is none, the eyeball is determined to be stationary. Motion information between frames may be determined based on pixel changes between the images.
According to an embodiment, the consecutive frames may be compared by calculating the normalized correlation coefficient between the current and previous image frames of the eyeball region. If the calculated normalized correlation coefficient is greater than a threshold, the eyeball is determined to be stationary; otherwise, it is determined not to be stationary.
Specifically, the normalized correlation coefficient (NCC) between the image of the eyeball region in the current frame and the image of the eyeball region in the previous frame may be calculated based on Equation (1) below.

$$NCC = \frac{\sum_{x,y}\left(C(x,y)-\bar{C}\right)\left(P(x,y)-\bar{P}\right)}{\sqrt{\sum_{x,y}\left(C(x,y)-\bar{C}\right)^{2}\,\sum_{x,y}\left(P(x,y)-\bar{P}\right)^{2}}} \qquad (1)$$

Here, C(x, y) and P(x, y) are the gray values of the pixels at coordinates (x, y) in the current and previous image frames of the eyeball region, respectively, and $\bar{C}$ and $\bar{P}$ are the average gray values of the current and previous image frames of the eyeball region, respectively.
FIG. 3 illustrates the normalized correlation coefficients of 50 image frames of an eyeball region according to an embodiment.
Referring to FIG. 3, when the eyeball is stationary, the normalized correlation coefficient between consecutive frames is very high, at 0.995 or more. When a blink occurs, the normalized correlation coefficient drops significantly; as shown in FIG. 3, at the 32nd frame it falls to 0.78. Accordingly, with a threshold of 0.99, the stationary and moving states of the eyeball can be distinguished.
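The stillness test described here can be sketched in a few lines. The code computes the standard normalized correlation coefficient over two equally sized grayscale frames and compares it with the 0.99 threshold mentioned in the text. The function names and the handling of constant (zero-variance) frames are assumptions added for illustration.

```python
import math


def ncc(cur, prev):
    """Normalized correlation coefficient of two same-sized grayscale frames."""
    c = [v for row in cur for v in row]
    p = [v for row in prev for v in row]
    if not c or len(c) != len(p):
        raise ValueError("frames must be non-empty and the same size")
    mean_c = sum(c) / len(c)
    mean_p = sum(p) / len(p)
    num = sum((a - mean_c) * (b - mean_p) for a, b in zip(c, p))
    den = math.sqrt(sum((a - mean_c) ** 2 for a in c) *
                    sum((b - mean_p) ** 2 for b in p))
    if den == 0.0:
        # Assumption: a flat frame counts as unchanged only if both frames match.
        return 1.0 if c == p else 0.0
    return num / den


def is_still(cur, prev, threshold=0.99):
    """True when the eyeball region is considered stationary between frames."""
    return ncc(cur, prev) > threshold
```

When `is_still` returns true, the previously determined 3D position can be reused and the 2D feature step skipped for the current frame.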
To improve the accuracy of the 3D position information, a reliability evaluation may be performed on the 2D features. Specifically, the reliability of the position information of the 2D features obtained with the supervised descent model can be evaluated based on the local binary pattern features extracted while determining the 2D features.
In the process of determining the 2D features using the supervised descent model, local binary pattern features are extracted, and the 2D features are determined by iterative calculation using the pre-trained supervised descent model based on the extracted local binary pattern features.
According to an embodiment, the local binary pattern features are classified using a previously trained support vector machine (SVM) classifier to check whether they accurately reflect the local binary pattern features of an actual eyeball region. If they do, the 2D features pass the evaluation; otherwise, they fail. When the 2D features pass, a 3D target model is determined based on the 2D features in step S103. When they fail, the method returns to steps S101 and S102, and the steps of identifying the eyeball region and determining the 2D features are repeated.
In one embodiment, the support vector machine classifier may be trained using positive and negative sample features. Here, an eyeball region that accurately reflects the actual shape of the eyeball is labeled as a positive sample, and an eyeball region that does not is labeled as a negative sample. The local binary pattern features extracted from eyeball regions labeled as positive samples are taken as positive sample features, and those extracted from eyeball regions labeled as negative samples are taken as negative sample features.
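For readers unfamiliar with local binary patterns, the sketch below computes the basic 8-neighbour LBP code for each interior pixel of a grayscale patch and returns a normalized 256-bin histogram, which is the kind of feature vector that could be fed to a classifier. The source does not say which LBP variant is used, so the 3x3 neighbourhood, the bit order, and the function name are assumptions; classifier training itself is not shown.

```python
def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 3x3 LBP codes.

    gray is a row-major list of rows of gray values. Border pixels are
    skipped because they lack a full 8-neighbourhood.
    """
    height, width = len(gray), len(gray[0])
    # Neighbour offsets (dy, dx), clockwise from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            center = gray[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(offsets):
                if gray[y + dy][x + dx] >= center:
                    code |= 1 << bit      # neighbour at least as bright
            hist[code] += 1
    total = sum(hist)
    return [v / total for v in hist] if total else hist
```

Histograms computed from regions labeled positive and negative would then serve as the positive and negative sample features for training.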
FIG. 4 shows a 3D target model according to an embodiment.
A method of determining eyeball position information according to an embodiment may include determining a 3D target model based on the 2D features (S103).
In this step, parameters related to the 2D features determined in step S102 may be obtained. Specifically, the eye parameters $\sigma$ and $\alpha$ are obtained based on the following model equation:

$$g(\sigma, \alpha) = \bar{g} + S\sigma + A\alpha$$

Here, $g$ is a general 3D eyeball model, which can be obtained from a general 3D face model; $\bar{g}$ is the general 3D average eye shape; S is a shape vector, which is a vector of face shape; A is an expression vector; and $\sigma$ and $\alpha$ are the eye parameters corresponding to the shape vector S and the expression vector A, respectively. In practice, the shape vector S is obtained based on a previously captured face image, and the expression vector A is obtained based on a plurality of previously captured face images.
Subsequently, the 3D target model $g'$ is built based on the eye parameters $\sigma$ and $\alpha$, the general 3D average eye shape $\bar{g}$, and the shape vector S and expression vector A obtained by pre-fitting, as shown in FIG. 4.
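Assuming the 3D target model is a linear combination of the average eye shape, the shape vector, and the expression vector, weighted by the two eye parameters, building it is a few multiply-adds. This linear form is an assumption of the sketch, as are the function name and the representation of the shape and expression terms as lists of basis components.

```python
def build_target_model(mean_shape, shape_basis, expr_basis, sigma, alpha):
    """Return mean_shape + sum(sigma_i * S_i) + sum(alpha_j * A_j).

    mean_shape is a flattened list of 3D coordinates. shape_basis and
    expr_basis are lists of components with the same length as mean_shape;
    sigma and alpha hold one weight per component.
    """
    if len(shape_basis) != len(sigma) or len(expr_basis) != len(alpha):
        raise ValueError("one weight is required per basis component")
    model = [float(v) for v in mean_shape]
    for basis, weights in ((shape_basis, sigma), (expr_basis, alpha)):
        for component, weight in zip(basis, weights):
            if len(component) != len(model):
                raise ValueError("component length must match the mean shape")
            for i, value in enumerate(component):
                model[i] += weight * value
    return model
```

For example, a two-point mean shape with one shape component and one expression component yields a model that shifts both points according to the two weights.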
The method for determining eyeball position information according to an embodiment may further include determining 3D position information based on the 3D target model (S104).
Specifically, a rigid transformation matrix is obtained using the 3D target model and the 2D features determined in step S102, and the 3D position information is determined based on the rigid transformation matrix and the 3D target model fitted in step S103.
In this step, the rigid transformation matrix can be obtained by minimizing the following objective function:

$$\min_{Q}\ \left\| s' - P\left(Q\left(g'\right)\right) \right\|^{2}$$

Here, P is a perspective projection transformation matrix, Q is the rigid transformation matrix, $g'$ is the 3D target model, and $s'$ is the 2D feature.
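The quantity being minimized can be sketched as a reprojection error: each 3D model point is moved by a rigid transformation, projected with a pinhole camera, and compared with its 2D feature. The sketch below only evaluates that error for a given rotation and translation; the search for the minimizing transformation would need an optimizer, which is not shown. The pinhole projection with a single focal length, and all function names, are assumptions made for illustration.

```python
def rigid_transform(point, rotation, translation):
    """Apply a 3x3 rotation and a translation to a 3D point."""
    return tuple(sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
                 for i in range(3))


def project(point, focal, principal):
    """Pinhole perspective projection of a 3D point onto the image plane."""
    x, y, z = point
    return (focal * x / z + principal[0], focal * y / z + principal[1])


def reprojection_error(model_points, features_2d, rotation, translation,
                       focal, principal):
    """Sum of squared distances between projected model points and 2D features."""
    total = 0.0
    for p3, (u, v) in zip(model_points, features_2d):
        pu, pv = project(rigid_transform(p3, rotation, translation),
                         focal, principal)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total
```

An optimizer would adjust the rotation and translation until this error stops decreasing; the transformed model points then give the eyeball position in camera coordinates.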
After the 3D position information is determined in step S104 according to an embodiment, 3D display can be performed through the following steps S105 and S106.
The method for determining eyeball position information according to an embodiment may further include converting the 3D position information into the 3D coordinate system of the display of a mobile device (S105).
The 3D position information determined in step S104 is based on the 3D coordinate system of the camera that captures the eyeball, whereas the 3D image viewed by the user is based on the 3D coordinate system of the display screen. Accordingly, the 3D position information can be converted into the 3D coordinate system of the display screen of the mobile device based on the rigid transformation matrix between the 3D coordinate system of the camera and the 3D coordinate system of the display screen.
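The conversion between the camera and display coordinate systems amounts to applying one fixed rigid transformation to the eye position. The sketch below assumes that transformation is available as a 4x4 homogeneous matrix, for example from a one-time device calibration; the matrix values in the usage note and the function name are hypothetical.

```python
def camera_to_display(p_cam, transform):
    """Map a 3D point from camera coordinates to display coordinates.

    transform is a 4x4 homogeneous matrix (rotation plus translation) that is
    assumed to be known for the device.
    """
    x, y, z = p_cam
    homogeneous = (x, y, z, 1.0)
    return tuple(sum(transform[i][j] * homogeneous[j] for j in range(4))
                 for i in range(3))
```

For instance, a transform that flips the y and z axes and shifts y by 50 units maps the camera-space point (10, 20, 300) to (10, 30, -300) in display space.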
A method of determining eyeball position information according to an embodiment may include adjusting or re-rendering the 3D image content to be displayed based on the conversion result (S106).
In practice, the 3D position of the eyeball relative to the display screen of the mobile device can be determined from the internal parameters of the camera, such as the position of its principal point and its focal length, together with the converted 3D position information. Subsequently, the display content is adjusted or re-rendered based on the 3D position relative to the display screen, so that the user sees an accurate 3D image from the current viewing position.
FIG. 5 illustrates the flow of a method for training a supervised descent model according to an embodiment.
The supervised descent model used in step S102 may be trained in advance. The training method is shown in FIG. 5, and the flowchart includes the following steps.
Training the supervised descent model according to an embodiment may include acquiring sample regions by identifying the eyeball region in captured sample face images (S501).
In this step, sample face images of users may be collected in advance using an image capturing device (for example, a camera). The camera may capture the user's face in various postures to obtain the sample face images. Subsequently, the eyeball region is identified in each sample face image to acquire a sample region, and sample key points are then labeled in each sample region.
Training the supervised descent model according to an embodiment may include iteratively training the supervised descent model using the sample key points labeled in the sample regions (S502). Specifically, the training process can be divided, by iteration order, into two stages: an initial iteration stage and a subsequent iteration stage. To improve the accuracy of the supervised descent model, approximate (coarse) features are extracted from the sample regions to train the model in the initial iteration stage, and precise features are extracted from the sample regions to train the model in the subsequent iteration stage. The approximate features may include at least one of histogram of oriented gradients (HOG) features, multi-block local binary pattern (MBLBP) features, speeded up robust features (SURF), and oriented FAST and rotated BRIEF (ORB) features.
The precise features include at least one of local binary pattern (LBP) features, Gabor features, discrete cosine transform (DCT) features, and binary robust independent elementary features (BRIEF).
In each iteration, sample regions at different scales may be obtained by scaling the sample region down or up according to set scaling ratios. Features, which may be approximate or precise, are then extracted from the sample region at each scale and used to further train the supervised descent model obtained in the previous iteration. A supervised descent model is thus trained at each scale, and its output is the 2D features determined in the sample region at that scale.
The output of the supervised descent model trained at each scale is compared with the pre-labeled sample key points, and the model whose output has the highest similarity is used in the next iteration. In practice, the scale of the sample region used by that model is taken as the optimal scale, and the sample region at the optimal scale is used as the sample region for the next iteration.
In this way, the supervised descent model is trained in two stages using the two types of features, the optimal scale is selected in each training iteration, and the model trained at the selected optimal scale serves as the basis for the next iteration, which improves the accuracy of the 2D features.
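The per-iteration selection of the best scale can be sketched as follows. Each candidate scale yields a set of predicted 2D points, assumed here to be mapped back to a common coordinate frame, and the scale whose prediction lies closest to the labeled key points is kept. Using mean Euclidean distance as the similarity measure is an assumption; the source only speaks of "highest similarity". All names are hypothetical.

```python
import math


def mean_point_distance(predicted, labeled):
    """Mean Euclidean distance between predicted and labeled 2D points."""
    if not labeled or len(predicted) != len(labeled):
        raise ValueError("point lists must be non-empty and the same length")
    return sum(math.hypot(px - lx, py - ly)
               for (px, py), (lx, ly) in zip(predicted, labeled)) / len(labeled)


def select_best_scale(predictions_by_scale, labeled):
    """Return the scale whose predicted points are closest to the labels."""
    return min(predictions_by_scale,
               key=lambda s: mean_point_distance(predictions_by_scale[s], labeled))
```

The model trained at the returned scale, and the sample region at that scale, would then seed the next training iteration.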

FIG. 6 illustrates the structure of an apparatus for determining 3D position information according to an embodiment.
An apparatus for determining 3D position information according to an embodiment may include an identification unit 601, a feature checking unit 602, and a 3D position information determination module 603.
The identification unit 601 may identify the eyeball region in the face image. The feature checking unit 602 may determine the 2D features in the eyeball region identified by the identification unit 601. Specifically, the feature checking unit 602 may determine the 2D features in the eyeball region using a pre-stored supervised descent model.
The 3D position information determination module 603 may determine the 3D target model based on the 2D features determined by the feature checking unit 602, and then determine the 3D position information based on the determined 3D target model.
When a user watches a video, the head is in a relatively fixed position most of the time, so the position of the eyes relative to the display screen of the mobile device stays the same. Accordingly, to improve the efficiency of determining the 3D position information, the apparatus may preferably further include an eyeball stillness determination unit 604.
The eyeball stillness determination unit 604 determines whether the eyeball is stationary based on the similarity between consecutive frames of the eyeball region identified by the identification unit 601, or on motion information between the frames, and outputs a determination result.
Specifically, the eyeball stillness determination unit 604 may calculate the normalized correlation coefficient between the image of the eyeball region in the current frame and the image of the eyeball region in the previous frame, and determine that the eyeball is stationary when the normalized correlation coefficient exceeds a set threshold.
Correspondingly, if the determination result output from the eyeball stillness determination unit 604 indicates that the eyeball is not stationary, the feature checking unit 602 determines the 2D features in the eyeball region. If the determination result indicates that the eyeball is stationary, the previously determined 3D position information is used as the 3D position information for the current face image frame.
Furthermore, the 3D position information determined by the 3D position information determination module 603 is based on the 3D coordinate system of the camera of the mobile device. To obtain the 3D position of the eyeball relative to the display of the mobile device, the apparatus may further include a coordinate system conversion module 605 and a display content control module 606.
The coordinate system conversion module 605 converts the 3D position information determined by the 3D position information determination module 603 into the 3D coordinate system of the display screen. The display content control module 606 may adjust or re-render the 3D image content displayed on the display screen based on the 3D position information converted by the coordinate system conversion module 605. To improve the accuracy of the finally determined 3D position information, the apparatus may further include a reliability evaluation unit (not shown).
The reliability evaluation unit acquires the local binary pattern features extracted while the feature checking unit 602 determines the 2D features using the supervised descent model. It then uses a classifier to evaluate, based on the acquired local binary pattern features, the reliability of the position information of the 2D features determined by the feature checking unit 602.
Specifically, the reliability evaluation unit classifies the acquired local binary pattern features using a pre-stored support vector machine classifier to check whether they accurately reflect the local binary pattern features of an actual eyeball region. If they do, the 2D features pass the evaluation; otherwise, they fail. The 3D position information determination module 603 may determine the 3D target model based on the 2D features when the 2D features pass the evaluation. If the evaluation result indicates low reliability, the reliability evaluation unit may cause the identification unit 601 to identify the eyeball region again.
According to another embodiment, the apparatus for determining 3D position information may further include a support vector machine classifier training module.
The support vector machine classifier training module takes the local binary pattern features extracted from eyeball regions labeled as positive samples as positive sample features and the local binary pattern features extracted from eyeball regions labeled as negative samples as negative sample features, and trains the support vector machine classifier using the positive and negative sample features.
In an embodiment of the present invention, the supervised descent model used by the feature checking unit 602 is stored in advance; it may be trained on other equipment, or trained in advance by the apparatus for determining 3D position information itself.
Therefore, the apparatus for determining 3D position information may preferably further include a supervised descent model training module (not shown).
The supervised descent model training module takes the eyeball region detected in a captured sample face image as the sample region, and may iteratively train the supervised descent model using the sample key points labeled in the sample region. Specifically, the supervised descent model training module may include a sample collection unit, an initial iteration unit, and a subsequent iteration unit. The sample collection unit identifies the eyeball region in the sample face image as the sample region. In the initial iteration stage, the initial iteration unit may extract approximate features from the sample region acquired by the sample collection unit and train the supervised descent model using the extracted approximate features. In the subsequent iteration stage, the subsequent iteration unit may extract precise features from the sample region acquired by the sample collection unit and train the supervised descent model using the extracted precise features. The approximate features may include at least one of histogram of oriented gradients (HOG) features, multi-block local binary pattern (MBLBP) features, speeded up robust features (SURF), and oriented FAST and rotated BRIEF (ORB) features. The precise features include at least one of local binary pattern (LBP) features, Gabor features, discrete cosine transform (DCT) features, and binary robust independent elementary features (BRIEF).
In each iteration, the initial iteration unit or the subsequent iteration unit may obtain sample regions at different scales by scaling the sample region down or up according to set scaling ratios. Features are then extracted from the sample region at each scale and used to further train the supervised descent model obtained in the previous iteration. A supervised descent model is thus trained at each scale, and its output is the 2D features determined in the sample region at that scale. Finally, the output of the supervised descent model trained at each scale is compared for similarity with the pre-labeled sample key points, and the model whose output has the highest similarity is carried into the next iteration.
In this way, the supervised descent model is trained in two stages using the two types of features. In each training iteration, the optimal scale is selected, and the supervised descent model trained at the optimal scale is used as the basis for the next training iteration, which improves the accuracy of the 2D features.
In an embodiment of the present invention, to improve the efficiency and accuracy of eyeball region identification, the identification unit 601 may identify the eyeball region based on previous eye positioning results.
Specifically, the identification unit 601 may include an eye bounding box calculation unit and an eyeball region acquisition unit. The eye bounding box calculation unit may generate a virtual eye bounding box based on a previous eye positioning result. The previous eye positioning result may be the position information of the 2D features determined from a previous face image.
The eyeball region acquisition unit obtains the eyeball region by cropping an image from the face image of the current frame based on the virtual eye bounding box generated by the eye bounding box calculation unit.
In practice, the eyeball region acquisition unit may further scale the cropped image down or up to a specific size and use the result as the image of the identified eyeball region.
In one embodiment, as shown in FIG. 7, the position information determination module 603 may include a model building unit 701, a matrix calculation unit 702, and a position information determination unit 703.
The model building unit 701 may obtain parameters related to the 2D features determined by the feature checking unit 602 and build a 3D target model based on the parameters. The matrix calculation unit 702 may obtain a rigid transformation matrix using the 3D target model built by the model building unit 701 and the 2D features determined by the feature checking unit 602. The position information determination unit 703 may determine the 3D position information based on the 3D target model built by the model building unit 701 and the rigid transformation matrix obtained by the matrix calculation unit 702. The specific functions of each module of the apparatus, and of each unit within each module, correspond to the steps of the method for determining 3D position information described above, so a detailed description is omitted.
According to an embodiment, after 2D features are determined in the eyeball region identified in the face image, a 3D target model is determined based on the 2D features, and 3D position information is determined based on the 3D target model. Furthermore, the resulting higher-precision 3D position information may be used to adjust or re-render the 3D image displayed on the display screen.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware components and software components. For example, the devices, methods and components described in the embodiments may include, for example, a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate (FPGA). array), programmable logic units (PLUs), microprocessors, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications running on the operating system. A processing device may also access, store, manipulate, process, and generate data in response to execution of software. For convenience of understanding, there are cases in which one processing device is used, but those skilled in the art will understand that the processing device includes a plurality of processing elements and/or a plurality of types of processing elements. It can be seen that it can include. For example, a processing device may include a plurality of processors or a processor and a controller. Other processing configurations are also possible, such as parallel processors.
Software may include a computer program, code, instructions, or a combination of one or more of the foregoing, which configures a processing device to operate as desired or processes independently or collectively. The device can be commanded. Software and/or data may be any tangible machine, component, physical device, virtual equipment, computer storage medium or device, intended to be interpreted by or provide instructions or data to a processing device. , or may be permanently or temporarily embodied in a transmitted signal wave. Software may be distributed on networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer readable media.
The method according to the embodiment may be implemented in the form of program instructions that can be executed by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.
Although the embodiments have been described above with reference to a limited set of drawings, those skilled in the art can apply various technical modifications and variations based on the above description. For example, appropriate results can be achieved even if the described techniques are performed in an order different from that described, and/or the components of the described system, structure, device, circuit, and the like are coupled or combined in a form different from that described, or are replaced or substituted by other components or equivalents.

Claims

A method of determining location information of an eyeball by a computing device including a processor, the method comprising:
identifying an eyeball region in a facial image;
identifying a 2D feature of the eyeball region; and
determining a 3D target model for the eyeball region based on the 2D feature, and determining 3D location information for the eyeball region based on the 3D target model,
wherein the identifying of the 2D feature in the eyeball region comprises:
determining a supervised descent method (SDM) model based on a local binary pattern (LBP); and
identifying the 2D feature in the eyeball region using the SDM model.
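As an illustration of the local binary pattern on which the supervised descent method model of this claim is based, the following is a minimal sketch of the basic 3x3 LBP operator and its histogram descriptor. The function names `lbp_3x3` and `lbp_histogram`, the clockwise bit order, and the use of NumPy are assumptions made for illustration; the claim does not specify a particular LBP variant or implementation.

```python
import numpy as np

def lbp_3x3(gray):
    """Return the basic 8-neighbour LBP code of every interior pixel.

    Hypothetical helper: each neighbour that is >= the centre pixel sets one bit.
    """
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left (assumed bit order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes

def lbp_histogram(gray):
    """Return a normalized 256-bin histogram of LBP codes for an eye patch."""
    codes = lbp_3x3(gray)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

A descriptor of this kind, computed on patches around the current landmark estimates, could serve as the feature vector that a supervised descent regressor maps to a landmark update.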

The method of claim 1,
wherein the determining of the 3D target model based on the 2D feature comprises:
obtaining a parameter associated with the 2D feature; and
building the 3D target model based on the parameter.

The method of claim 1 or claim 2,
wherein the determining of the 3D location information based on the 3D target model comprises:
obtaining a matrix using the 3D target model and the 2D feature; and
determining the 3D location information based on the matrix.

The method of claim 1,
wherein the identifying of the 2D feature in the eyeball region comprises:
comparing preceding and following frames of the eyeball region to determine whether the eyeball is stationary; and
when it is determined that the eyeball is not stationary, identifying the 2D feature in the eyeball region.

The method of claim 4,
wherein the comparing of the preceding and following frames of the eyeball region to determine whether the eyeball is stationary further comprises:
when it is determined that the eyeball is stationary, determining previously determined 3D location information to be current 3D location information.

The method of claim 4,
wherein the comparing of the preceding and following frames of the eyeball region to determine whether the eyeball is stationary comprises:
calculating a normalized correlation coefficient (NCC) between a current image frame of the eyeball region and a previous image frame of the eyeball region; and
determining that the eyeball is stationary when the NCC exceeds a threshold value.
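The stationary-state test of this claim can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the zero-mean form of the NCC, equal-sized grayscale patches, and a hypothetical threshold of 0.95; the names `normalized_correlation` and `is_stationary` are likewise assumptions.

```python
import numpy as np

def normalized_correlation(prev_patch, curr_patch):
    """Zero-mean normalized correlation coefficient of two equal-sized patches."""
    a = np.asarray(prev_patch, dtype=np.float64).ravel()
    b = np.asarray(curr_patch, dtype=np.float64).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        # Flat patches have no defined correlation; treat as "no match" (assumption).
        return 0.0
    return float((a * b).sum() / denom)

def is_stationary(prev_patch, curr_patch, threshold=0.95):
    """Return True when the NCC exceeds the threshold (0.95 is a hypothetical value)."""
    return normalized_correlation(prev_patch, curr_patch) > threshold
```

When `is_stationary` returns True, the previously determined 3D location information can be reused for the current frame, so the 2D feature identification only needs to run when the eye region has changed.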

The method of claim 1, further comprising:
converting the 3D location information into a 3D coordinate system of a display; and
adjusting or re-rendering a 3D image of the display based on a result of the conversion.
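The conversion into the 3D coordinate system of the display can be illustrated as a rigid transformation. The sketch below assumes that the eye position is first expressed in a camera coordinate system and that a rotation matrix and translation vector between camera and display are known from calibration; the claim does not state the form of the conversion, and `camera_to_display`, `rotation`, and `translation` are hypothetical names.

```python
import numpy as np

def camera_to_display(point_cam, rotation, translation):
    """Map a 3D point from camera coordinates to display coordinates.

    Assumes a rigid transform: p_display = R @ p_camera + t.
    """
    point_cam = np.asarray(point_cam, dtype=np.float64)
    rotation = np.asarray(rotation, dtype=np.float64)
    translation = np.asarray(translation, dtype=np.float64)
    return rotation @ point_cam + translation

# Example: a camera mounted 0.1 units above the display origin, with aligned axes.
eye_cam = [0.03, 0.0, 0.6]
eye_display = camera_to_display(eye_cam, np.eye(3), [0.0, -0.1, 0.0])
```

The resulting display-space eye position is what a renderer would use to adjust or re-render the 3D image for the current viewpoint.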

(Deleted)

The method of claim 1,
wherein the determining of the SDM model comprises:
acquiring a sample region by measuring an eyeball region in a sample image; and
iteratively training the SDM model using sample features measured in the sample region.

The method of claim 9,
wherein the iterative training of the SDM model comprises:
extracting coarse features from the sample region corresponding to the eyeball region in the sample image in an initial iteration stage; and
extracting fine features from the sample region in a subsequent iteration stage,
wherein the coarse features include at least one of a histogram of oriented gradients (HOG) feature, a multi-block local binary pattern (MB-LBP) feature, a speeded-up robust features (SURF) feature, and an oriented FAST and rotated BRIEF (ORB) feature; and
the fine features include at least one of a local binary pattern (LBP) feature, a Gabor wavelet feature, a discrete cosine transform (DCT) feature, and a binary robust independent elementary features (BRIEF) feature.

The method of claim 10,
wherein the iteration stage comprises:
extracting features from the sample region in different scale spaces, and training the SDM model obtained in a previous iteration;
comparing the SDM model trained in each scale space with pre-measured features of the sample region; and
selecting, according to a result of the comparing, one of the plurality of SDM models for use in a next iteration.

The method of claim 1,
wherein the detecting of the eyeball region in the facial image comprises:
determining a position of the eyeball, and generating a virtual eyeball frame based on the determined position; and
obtaining the eyeball region from a facial image of a current frame based on the virtual eyeball frame,
wherein the determined position is related to location information of the 2D feature.

An apparatus for determining location information of an eyeball, the apparatus comprising:
an identification unit configured to identify an eyeball region in a facial image;
a feature identification unit configured to identify a 2D feature of the eyeball region; and
a 3D location information determining unit configured to determine a 3D target model for the eyeball region based on the 2D feature, and to determine 3D location information for the eyeball region based on the 3D target model,
wherein the feature identification unit is configured to:
determine a supervised descent method (SDM) model based on a local binary pattern (LBP); and
identify the 2D feature in the eyeball region using the SDM model.

The apparatus of claim 13,
wherein the 3D location information determining unit comprises:
a model building unit configured to determine the 2D feature to obtain a parameter, and to build the 3D target model based on the parameter;
a matrix calculation unit configured to obtain a matrix using the 3D target model and the 2D feature; and
a location information determining unit configured to determine the 3D location information based on the 3D target model and the matrix.

The apparatus of claim 13 or claim 14, further comprising:
a stationary-state determination unit configured to compare preceding and following frames of the eyeball region to determine whether the eyeball is stationary,
wherein the feature identification unit is configured to identify the 2D feature in the eyeball region when the eyeball is not stationary.

The apparatus of claim 15,
wherein the feature determination unit is configured to, when the eyeball is stationary, use previously determined 3D eyeball location information as current 3D eyeball location information.

The apparatus of claim 15,
wherein the stationary-state determination unit is configured to calculate a normalized correlation coefficient (NCC) between a current image frame of the eyeball region and a previous image frame of the eyeball region, and to determine that the eyeball is stationary when the NCC exceeds a threshold value.

The apparatus of claim 13, further comprising:
a coordinate system conversion unit configured to convert the 3D location information determined by the 3D location information determining unit into a 3D coordinate system of a display; and
an image adjustment unit configured to adjust or re-render a 3D image of the display based on the 3D location information.

The apparatus of claim 13,
wherein the 3D location information determining unit is configured to:
obtain a parameter associated with the 2D feature; and
build the 3D target model based on the parameter.

The apparatus of claim 19, further comprising:
an SDM model training unit configured to determine a sample region by measuring an eyeball region in a captured sample image, and to iteratively train the SDM model using sample features measured in the sample region.